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Abstract 

In  the  last  few  years,  there  seems  to  have  been  a  resurgence  of  interest  in  the 
membership-set-theoretic  approach  to  parameter  estimation.  This  report  concentrates  on 
the  optimal  bounding  ellipsoid  (OBE)  approach  to  membership-set  parameter  estimation, 
with  emphasis  being  placed  on  the  performance  of  one  particuhir  ORE  algorithm  in  non¬ 
ideal  conditions.  It  is  first  shown  that  OBE  algorithms  offer  distinct  advantages  over 
commonly  used  recursive  parameter  estimation  algorithms  like  the  recursive  least-squares 
(RLS)  algorithm  in  some  real-life  environments.  Then  the  e.xtension  of  a  particular  OBE 
algorithm  to  the  problem  of  parameter  estimation  with  unobservable  but  bounded  inputs 
(ARMA  parameter  estimation)  is  discussed  in  some  detail.  The  problem  is  important 
because,  in  many  signal  processing  applications,  the  inputs  to  the  system  under 
consideration  are  unknown.  Analysis  of  the  extended  algorithm  shows  that  under  some 
conditions,  the  extended  algorithm  yields  100%  confidence  intervals  fur  liie  paiameters  at 
every  sampling  instant.  This  feature  does  not  appear  to  be  present  in  any  other  existing 
ARMA  parameter  estimation  algorithms.  Furthermore,  the  transient  performance  of  this 
algorithm  is  observed  to  be  superior  to  that  of  the  extended  least-squares  algorithm.  Finite 
precision  effects  of  one  of  the  OBE  algorithms  are  also  studied  via  analysis  of  error 
propagation  in  the  algorithm  and  through  simulationsf'The  analysis  shows  that  the 
algorithm  is  stable  with  respect  to  errors  due  to  finite  word-length  computation  and 
storage.  Simulation  results  demonstrate  the  superiority  of  the  algorithm  to  the 
conventional  recursive  least-squares  algorithm  for  small  word-lengths.  Finally,  analysis 
of  the  tracking  characteristics  of  one  of  the  OBE  algorithms  is  performed.  It  is  shown  that 
the  algorithm  is  capable  of  tracking  small  time  variations  in  the  parameters.  Since  large 
variations  may  cause  the  algorithm  to  fail,  a  rescue  procedure  is  proposed  which  can 
enable  the  algorithm  to  also  track  large  time  variations.  Simulation  results  demonstrate 
that  the  tracking  capability  of  the  algorithm  is  comparable  to  that  of  existing  adaptive 
filtering  algorithms. 

*This  report  is  a  reproduction  of  Ph.D.  dissertation  of  Ashok  K.  Kao,  Department  of 
Electrical  and  Computer  Engineering,  University  of  Notre  Dame,  Notre  Dame,  Indiana, 
August,  1989.  This  work  has  been  supported,  in  part,  by  the  National  Science 
Foundation  under  Grant  MIP  87-1 1 174  and  ,  in  part,  by  the  Office  of  Naval  Research 
under  Contracts  N00014-87-k-()284  and  N00014-89-J-1788 
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CHAPTER  I 


MEMBERSHIP-SET  PARAMETER  ESTIMATION 


1.1  Introduction 

System  identification  deals  with  the  formulation  of  mathematical  models  of  dynamical 
systems  based  on  input  and  output  data  records.  The  formulation  of  a  model  is  usually 
done  in  two  stages  [Ljung,  1987].  In  the  first  stage,  if  the  dynamics  of  the  unknown 
system  are  known,  then  a  set  of  candidate  models  (the  model  set)  can  be  specified.  In 
other  cases,  standard  model  sets  (black  box  models)  can  be  used  without  reference  to  the 
actual  system  dynamics.  The  model  set  itself,  is  just  a  mathematical  relationship  between 
the  system  variables.  It  could  be  in  continuous  time  or  discrete  time.  It  could  also  be 
characterized  as  being  linear  or  nonlinear,  deterministic  or  stochastic.  The  model  set 
usually  contains  unknown  quantities,  vvhich  are  termed  the  unknown  parameters  of  the 
model.  The  next  stage  of  system  identification  uses  the  input-output  record  and  other 
information  to  obtain  the  'best'  model  from  the  model  set  by  choosing  the  unknown 
parameters  appropriately.  This  stage  of  identification  has  been  termed  parameter 
estimation  in  the  literature.  In  the  classical  approach,  once  the  system  model  (with  an 
unknown  parameter  vector  9*)  has  been  formulated,  then  a  predictor  model  (with  an 
adjustable  parameter  vector  9)  is  formed,  which,  yields  at  every  instant  of  time,  a 
prediction  of  the  system  output,  based  on  past  information.  The  predicted  output  is  a 
function  of  the  parameter  vector  9. Then,  a  criterion  of  fit  is  chosen.  Perhaps  the  most 
widely  used  criterion  is  the  mean  squared  prediction  error  criterion,  where  the  prediction 


error  is  the  error  between  the  system  output  and  the  predictor  model  output.  Once  a 
criterion  of  fit  is  chosen,  the  parameter  estimate  is  chosen  to  be  that  value  of  9  which  best 
fits  the  criterion.  Thus  least-squares  parameter  estimates  minimize  the  mean  or  average 
squared  prediction  error. 

System  identification  forms  the  core  of  most  adaptive  signal  processing  and  adaptive 
control  techniques.  In  telephony,  for  example,  echo  cancellation  is  required  for  long 
distance  speech  communication  [Messerschmitt,  1984].  In  the  transmission  of  speech, 
echos  arise  primarily  due  to  leakage  through  the  far  end  hybrid.  Speech  echo  cancellers 
usually  model  the  hybrid  as  a  FIR  filter  and  adjust  the  parameters  of  the  predictor  filter 
(also  a  FIR  filter)  to  minimize  the  mean-squared  error  between  the  echo  and  the  predictor 
filter  output.  In  high  resolution  spectral  estimation,  the  signal  of  interest  is  usually 
assumed  to  be  a  sum  of  sinusoids  in  noise.  This  signal  is  often  modeled  as  the  output  of 
an  HR  filter  driven  by  white  noise.  The  parameters  of  the  filter  are  then  estimated  and 
used  to  construct  an  estimate  of  the  spectrum.  In  adaptive  control,  a  model  of  the 
unknown  plant  is  first  formed.  A  common  model  used  is  the  ARX  model  described  in  the 
next  section.  The  estimated  model  parameters  are  then  used  by  the  controller  to  generate 
the  control  signal. 

Most  system  model  sets  incorporate  a  disturbance  term  which  can  represent 
observation  noise  or  modeling  uncertainty.  This  noise  term  is  usually  assumed  to  be  a 
stochastic  process.  Some  statistical  estimation  schemes  such  as  maximum  likelihood 
estimation  require  precise  knowledge  of  the  probability  density  function  of  the  noise.  The 
simpler  least-squares  schemes  require  the  noise  to  be  white  in  order  to  obtain  unbiased 
parameter  estimates.  If  the  noise  is  not  white,  then  unbiased  estimates  can  be  obtained  by 
modeling  the  noise  term  as  a  linear  regression  process.  This  approach  is  used  in 
extended  least-squares(ELS),  generalized  least-squares  (GLS),  recursive  maximum 


likelihood  (RML)  [Ljung,  1983],  output  error  memods  [Goodwin,  1984],  The 
convergence  analysis  of  all  these  methods,  however,  does  require  that  the  noise  be  a 
stationary  stochastic  process. 

As  opposed  to  classical  approaches  to  parameter  estimation  which  yield  point 
estimates  of  parameters  by  optimizing  some  criterion  of  fit,  membership-set  parameter 
estimation  (MSPE)  algorithms  yield  a  set  of  parameter  vectors  which  is  compatible  w'ith 
the  model  structure,  observation  record  and  noise  constraints.  In  general,  no  know'ledge 
of  the  statistics  of  the  noise  process  is  assumed.  However,  the  noise  is  assumed  to  be 
constrained  in  some  other  way,  e.g.,  with  bounded  energy  [Fogel,  1979]  or  bounded 
magnitude  [Fogel,  1982].  Membership-set  algorithms  are  thus  preferable  when  the  noise 
is  too  structured  as  in  the  case  of  error  occurring  when  a  large  order  system  is  modeled 
by  a  lower  order  model.  MSPE  algorithms  yield  100%  confidence  internals  for  the 
parameter  estimates  at  every  time  instant.  In  contrast,  confidence  intcrv'als  for  least- 
squares  parameter  estimates  can  be  obtained  only  asymptotically  in  most  cases.  Another 
important  feature  of  recursive  MSPE  algorithms  is  a  discerning  update  strategy  whereby 
only  a  fraction  of  the  incoming  data  points  need  be  used  to  construct  the  membership 
sets.  This  not  only  reduces  the  total  processing  time  but  also  enhances  the  potential  for 
using  these  algorithms  in  multi-channel  environments.  For  every  observation  which  is 
processed,  recursive  MSPE  algorithms  can  also  indicate  if  the  observation  is  consistent 
with  the  model  and  noise  bounds.  Thus,  the  presence  of  outliers  or  large  modeling 
inaccuracy  can  be  detected  quite  easily,  in  contrast  to  other  parameter  estimation  schemes. 

In  this  report,  attention  will  be  focussed  on  the  estimation  of  membership  sets  of 
parameters  of  linear  discrete  time  difference  equation  models.  In  the  remainder  of  this 
chapter,  an  overview  of  some  membership-set  algorithms  will  be  provided.  In  Chapter 
a  particular  class  of  membership-set  algorithms  called  optimal  bounding  ellip.soidal  (OBE) 
estimation  algorithms  is  studied  at  some  length.  The  algorithms  in  this  class  obtain. 
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recursively,  ellipsoidal  outer  bounds  of  the  membership  sets  of  parameters.  Tliough  the 
membership-sets  obtained  are  often  larger  than  those  obtained  by  using  other  algorithms, 
the  bounding  ellipsoid  algorithms  have  the  advantages  of  low  compuiational  complexity, 
analytical  tractability,  and  robustness  to  parameter  viu'iations  and  finite  precision  effects. 
In  Chapter  III,  a  particular  bounding  ellipsoid  algorithm,  i.e..  the  DHOBE  algorithm  of 
Dasgupta  and  Huang  [Dasgupta,  1987],  is  used  for  the  estimtition  of  parameters  of 
systems  with  unobservable  inputs.  The  performance  of  the  algorithm  is  analyzed  and 
sufficient  conditions  for  satisfactory  behavior  are  derived.  In  Chapter  IV,  finite  precision 
effects  in  the  DHOBE  algorithm  are  studied.  It  is  shown  that  the  algorithm  remains  stable 
(algorithmic  variables  remain  bounded)  in  finite  word-length  environments.  Simulation 
results  show  that  the  performance  of  the  algorithm  is  superior  to  the  recursive  least- 
squares  algorithm  when  the  word-length  is  small.  The  tracking  propeny  of  the  DHOBE 
algorithm  is  studied  in  Chapter  V.  Conditions  under  which  the  algorithms  can  track 
variations  in  parameters  are  derived.  Modifications  to  the  algorithm  are  suggested  which 
improve  the  tracking  at  the  expense  of  increasing  the  size  of  the  outer  bounding 
ellipsoidal  approximation  to  the  membership  set. 

The  overview  of  membership-set  parameter  estimation  algorithms  commences  with  an 
enumeration  of  the  different  model  structures  which  are  commonly  employed  in 
parameter  estimation. 

1.2  .Model  Structures 

The  various  types  of  model  structures  which  are  considered  in  this  report  are  defined 
below. 

MA  model:  A  moving  average  model  is  defined  by 


y(t)  =  bou(t)  +  biu(t-l)  bm  u(t-n)  -t-  v(t) 


(1.2.1) 


where  y(t)  and  u(t)  denote  the  system  output  and  input  at  time  instant  t,  respectively  and 
v(t)  denotes  the  disturbance  at  time  t. 

AR  model:  An  autoregressive  model  is  defined  by  the  following  difference  equation 

y(t)  =  aiy(t-l)  +  a2y(t-2)  +...+  any(t-n)  +  v(t)  (1.2.2) 

where  y(t)  and  v(t)  are  as  defined  above. 

.ARX  model;  An  autoregressive  v'ith  exogenous  input  model  is  defined  by 

y(t)  =  aiy(t-n+  a2y(t-2)H..+  any(t-n)  +  buu(t)-t-  biu(t-n+..+b,„  u(t-m)  +-  \(t)  ( 1.2.3) 

where  y(t),  u(t)  and  v(t)  are  defined  above. 

.ARM.A  model  An  autoregressive  moving  average  model  is  defined  by 
y(t)  =  aiy(t-n+  a2  y(t-2)+..+  an  y(t-n)  +  w(t)-t-  cq  w(t-l)i-..+Cr  w(t-r)  (1.2.4) 

where  w(t)  is  a  unobservable  input  or  noise  sequence  and  y(t)  is  the  system  output. 

In  the  statistical  literature,  v(t)  and  w(t)  are  assumed  to  be  zero  mean,  white  stochastic 
processes.  However,  since  the  approach  used  in  this  report  is  deterministic,  these 
restrictions  will  not  be  imposed.  It  will  be  assumed,  instead,  that  v(t)  and  wit)  are 
bounded  in  magnitude.  This  assumption  is  quite  realistic  in  practice.  For  example, 
observation  noise  which  arises  due  to  quantization  or  round-off  error  is  bounded,  as  is 
the  error  in  measurements  due  to  climatic  effects.  Similarly,  the  error  due  to  model 


mismatch  is  often  bounded. 
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1.3  Membership  Sets 
Definition;  (Membership  set) 

Membersh’p  set:  A  membership  set  is  a  region  in  the  parameter  space  which  contains  all 
the  parameter  vectors  consistent  with  the  model  structure,  observation  record  and  noise 
bounds. 

The  ARX  model  which  was  defined  in  the  last  section  is  a  commonly  used  model 
structure,  since  it  can  model  linear  time-invariant  systenis  wiiich  have  poles  ai'.d  zeros. 
The  ARX  system  model  can  be  expu  ssed  as 

y(t)  =  0*T(I)(t)  +  v(t)  (1.3.1) 

with 

0*  =  [aj,  a2,...,  an,  bo,  b},  b^  (1.3.2) 

and 

^(t)  =  fy(t-l),  y(t-2), ..,  y(t-n),  u(t),  u(t-l),  ..,utt-m)lT  (1.3.3) 

It  IS  assumed  that  the  noise  {v(t))  is  constrained  in  some  way.  For  simplicity,  it  is 
assumed  that  {v(t)}  is  bounded  in  magnitude.  Hence  there  exists  a  known  positive  y, 
such  that 

lv(t;l<y  (1.3.4) 

Given  {y(t),  u(t),  t  =  1,2,..,  T),  the  goal  of  membership-set  parameter  estimation  is 
to  find  the  smallest  set  of  parameter  vectors  \\i(T)  in  euclidean  space  R^,  where  N  = 
n-t-m-t-1,  which  is  consistent  with  all  the  above  equations  (1.3. 1)  to  (1.3.4). 

From  (1.3.1)  and  (1.3.4),  it  follows  that 


I  y(t)  -  0*TcD(t)  I  <  y 


(1.3.5) 
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So  the  set  S(  enclosed  by  the  hyperplanes 

Hi(t)  -  {  0  G  RN  ;  y(t)  -  =  +y  |  and  H2(t)  =  |0  e  R'"^' ;  yit)  -  0^O(t)  =  -y  ) 

contains  the  true  parameter.  The  S(  can  be  described  as  the  intersection  of  two  half 
spaces.  Given  the  observation  record  up  to  time  instant  T.  the  smallest  possible 
membership  set  will  be 

\l/(T)  =  n  S 
1=1 

Finding  a  description  of  the  exact  membership  set  is  an  arduous  task,  since  it  may  have 
thousands  of  vertices  for  even  small  values  of  the  order  N  and  data  record  size  T.  For 
example,  when  N=3.  the  number  of  vertices  could  be  as  large  as  'ftT-l  )+2  |Co\eter. 
1973].  Thus  tor  T=1()0,  the  membership  set  could  have  almost  10. 000  vertices. 
Consequently  most  MSPE  algorithms  attempt  to  obtain  a  region  of  simpler  shape  like  an 
ellipsoid  or  a  box  which  contains  the  true  membership  set  \|/(T).  Recently,  however,  two 
exact  polytope  bounding  algorithms  have  been  proposed  [Walter,  1987;  .\lo,  1988], 
which  recursively  yield  an  exact  description  of  the  true  membership  set.  The  next  section 
discusses  the  exact  polytope  updating  algorithm  of  Mo  and  Norton  and  the  exact  cone 
updating  algorithm  of  Walter  and  Lahanier. 


1.4  Exact  Polytope  Bounding 

Even  though’  9'’orcti.ally,  the  true  membership  set  may  have  thousands  of  vertices 
as  the  numbe  .■  '  ua  points  processed  increases,  in  practice,  once  the  intersection  of  the 
first  few  half  spa '  s  formed,  the  membership  set  becomes  quite  small  and  so  only  a 
small  fraction  of  the  incoming  half  spaces  affect  the  membership  set.  Hence  the  number 
of  vertices  of  the  membership  set  increases  quite  slowly  as  the  number  of  data  points 
increases.  This  provides  the  motivation  for  developing  algorithms  which  identify  the 
precise  shape  of  the  membership  set. 


For  the  membership  set  at  any  instant  t,  the  exact  polytope  updating  algorithm  (e.p.u. 
algorithm)  of  [Mo,  1988],  stores  a  list  of  all  the  vertices  in  terms  of  their  components. 
For  each  vertex,  a  list  of  all  adjacent  vertices  (the  vertex-vertex  list),  and  the  hyperplanes 
which  intersect  and  form  the  vertex  (the  vertex-plane  list)  are  also  stored.  When  a  new  set 
of  parameter  bounds  (the  set  Si)  arrives,  a  test  is  made  to  see  how  Si  intersects  the 
existing  membership  set  V)/(t-l).  If  the  intersection  is  void,  this  indicates  that  either  the 
model  structure  or  noise  bound  is  incorrect.  If  Si  contains  v(t-l),  then  \|/(t)  =  \)/(t-l),  and 
the  parameter  bounds  provided  by  Si  are  redundant.  If  the  intersection  is  not  void  and  if 
Si  does  not  contain  \|/(t-l),  then  \|/(t-l)  has  to  be  updated.  This  involves  (i)  calculating  the 
new  vertices  formed  when  Fli(t)  and/or  Hoft)  cut  \|/(t-l),  and  creating  vertex-vertex  and 
vertex-plane  adjacency  lists  for  the  newly  formed  vertices;  (ii)  updating  the  vertex-venex 
and  vertex-plane  lists  of  the  old  vertices  which  belong  to  \]/(t-l)  o  Si,  and  (iii)  removing 
the  vertices  made  redundant  by  Si . 

The  exact  cone  updating  (e.c.u.)  algorithm  of  Walter  and  Lahanier  is  similar  in  many 
ways  to  the  e.p.u.  algorithm.  It  transforms  the  membership  set,  which  is  a  convex 
polyhedron  in  ,  into  a  polyhedral  cone  in  Thus  the  membership  set  at  time  t-1 
is  represented  by  a  cone  C[.i.  The  extreme  rays  of  Ci.]  are  stored  as  columns  of  a  matrix 
M.  When  the  parameter  bounds  due  to  the  t’th  observation  are  applied,  then  as  before, 
three  situations  can  arise.  Either  Q.i  n  Si  =  <|>,  or  St  3  Q.i,  or  one  of  the  hyperplanes 
Hl(t)  or  K2T)  cut  Cm.  In  the  latter  case,  the  extreme  rays  of  C[  will  be  those  of  Cm 
lying  in  the  set  St  ,  and  new  rays  belonging  to  St  n  Cm.  Each  of  these  new  rays  is 
obtained  as  a  linear  combination  of  two  adjacent  extreme  rays  of  Ci.i  lying  on  either  side 
of  the  cutting  hyperplane.  Finally,  constraints  that  become  redundant  at  this  stage  are 
eliminated. 

The  computational  complexity  of  both  these  methods  is  quite  large,  even  for  small 
order  models.  The  order  of  complexity  at  each  iteration  also  grows  slowly  as  new  data 
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points  are  processed,  since  the  number  of  vertices  can  increase  with  every  new  sample. 


For  these  reasons,  they  are  probably  not  suitable  for  real  time  processing  applications.  It 


is  claimed  in  [Mo,  19881,  that  the  e.p.u  algorithm  is  superior  to  the  e.c.u.  algorithm  in 

1 

I  terms  of  numerical  robustness.  Numerical  results  have  only  been  reported  for  the  e.p.u 


I  algorithm,  applied  to  one  ARX  parameter  estimation  problem,  and  it  remains  to  be  seen, 

how  well  the  algorithms  perform  in  practice. 


1.5  Orthotope  Bounding 

A  popular  approach  in  membership-set  parameter  estimation  is  to  approximate  the 
membership  set  by  an  orthotope  (a  rectangular  box)  which  contains  the  membership  set. 
This  has  the  added  advantage  of  giving  accurate  uncertainty  intervals  for  each  of  the 
unknown  parameters.  A  membership-set  description  in  terms  of  parametric  uncertainty 
intervals  (PUI's)  on  the  individual  parameters,  is  often  preferable  to  a  description  of  an 
extremely  complicated  N-dimensional  region.  This  is  particularly  true  when  the 
parameters  have  a  direct  physical  interpretation.  It  has  been  shown,  [Milanese,  1982], 
that  the  lighitst  possible  PLT's  can  be  obtained  by  linear  programming.  Specifically, 
.Milanese  and  Belforte  showed  that  the  problem  can  be  reduced  to  solving  2N  linear 
programming  problems  in  N  variables  with  2T  constraints,  where  N  is  the  order  of  the 
model  and  T  is  the  number  of  data  points.  Their  algorithm,  which  is  termed  the  minimum 
uncertainty  interval  correct  estimator  (MUICE),  is  not  recursive  and  computationally 
intensive.  Consequently,  it  is  not  suited  for  real  time  applications  or  to  the  analysis  of 
nun-stationary  data.  A  simple  recursive  algorithm  which  constructs  outer  bounding 
orthotopes  has  been  proposed  [Huang,  1980].  However  this  algorithm  does  not  yield 
small  enough  orthotopes  for  large  order  systems.  Another  recursive  algorithm  which 
constructs  outer  bounding  orthotopes  has  been  proposed  recently  [Pearson,  1986).  The 
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PUI's  obtained  by  using  the  algorithm  are  not  as  tight  as  the  PUI's  of  the  MUICE, 
although,  they  are  much  easier  to  evaluate. 

It  is  clear  from  the  discussion  above,  that  both  the  exact  polytope  bounding  and  the 
orthotope  bounding  algorithms  involve  considerable  amounts  of  computation.  In  fact,  the 
computational  complexity  of  the  exact  polyiope  bounding  algorithms  depends  heavily  on 
the  characteristics  of  the  data.  Thus  these  algorithms  are  not  suited  for  real  time 
applications.  In  the  next  chapter,  several  ellipsoidal  bounding  algorithms  will  be 
discussed,  all  of  which  have  the  advantageous  features  of  low  computational  complexity 
and  analytical  tractability.  The  latter  feature  has  simplified  the  application  of  the 
algorithms  to  different  cases  such  as  ARMA  parameter  estimation  and  output  error 
parameter  estimation,  and,  has  enabled  its  implementation  via  lattice  filters  and  systolic 
arrays.  All  this  and  much  more  coming  up,  so  don’t  go  away! 


CHAPTER  II 


BOUNDING  ELLIPSOIDAL  PARAMETER  ESTIMATION 

2.1  Introduction 

The  optimal  bounding  ellipsoid  algorithms  outer  bound  the  membership  set  of 
parameters  by  ellipsoids  in  the  parameter  space.  The  idea  of  using  ellipsoids  to  bound 
^ets  was  originally  proposed  by  Schweppe  ISchweppe,  1967],  in  the  context  of  state 
estimation.  He  formulated  a  recursive  algorithm  for  completely  specified  dynamic 
systems,  with  unknown  but  bounded  inputs  and  bounded  observation  errors.  At  every 
instant,  the  algorithm  yields  ellipsoidal  sets,  which  contain  the  true  time  varying  system 
state.  The  state  estimate  can  be  taken  to  be  the  center  of  the  ellipsoid.  The  algorithm 
differs  from  the  Kalman  filter,  developed  for  linear  dynamical  systems  with  gaussian 
inputs  and  noise,  only  in  the  gain  sequence.  Following  Schweppe,  Fogel  proposed  a 
recursive  algorithm  for  calculating  ellipsoidal  outer  bounds  of  the  membership  sets  of 
parameters,  assuming  energy  constraints  on  the  noise  [Fogel,  1979].  By  imposing 
instantaneous  bounds  on  the  noise,  Fogel  and  Huang  [Fogel,  1982]  came  up  with 
membership-set  algorithms,  wherein,  the  size  of  the  bounding  ellipsoids  is  optimized 
according  to  different  criteria.  A  by-product  of  the  optimization  procedure  is  a  discerning 
update  strategy  which  makes  efficient  use  of  data  [Fogel,  1982].  Based  on  their  work, 
different  bounding  ellipsoidal  algorithms  have  been  proposed  in  the  past  few  years.  In 
this  chapter,  the  optimal  bounding  ellipsoid  (OBE)  algorithms  of  [Fogel,  1982]  are 
presented  first.  Then  in  Section  2.3,  a  more  recent  OBE  algorithm,  the  DHOBE  algorithm 
[Dasgupta,  1987],  which  uses  a  slighdy  different  ellipsoidal  formulation  and  optimization 
criterion,  is  discussed  at  some  length.  An  analogy  between  weighted  least-squares  and 


II 


the  OBE  algorithms  is  drawn  in  Section  2.4.  Some  simulated  situations  in  which  the 
performance  of  the  OBE  algorithms  is  seen  to  be  markedly  superior  to  the  recursive  least- 
squares  algorithm  are  presented  in  Section  2.5.  Finally,  in  Section  2.6,  an  improvement 
to  the  OBE  algorithms  of  Fogel  and  Huang  iBelforte.  1985]  is  discussed  and  the 
improved  algorithm  is  compared  with  the  other  OBE  algonthms  via  simulations. 

2.2  The  OBE  Algorithms 

As  mentioned  earlier,  the  OBE  algorithms  of  Fogel  and  Huang  [Fogel,  1982], 
recursively  obtain  ellipsoidal  outer  bounds  to  the  membership  set.  The  model  structure 
considered  is  the  ARX  model  of  (1.3.1) 

y(t)  =  0*4'<h(t)  +  v(t)  (2.2. 1 ) 

where  0*  ,  the  true  parameter  vector,  and,  d)(t),  the  regressor  vector,  are  N  dimensional 
vectors  given  by  (1.3.2)  and  (1.3.3)  respectively.  The  noise  v(t)  is  assumed  to  be 
uniformly  bounded  in  magnitude 

lv(t)l<Y  (2.2.2) 

In  order  to  develop  a  recursive  formulation  for  the  bounding  ellipsoids,  it  is  assumed  that 
at  time  instant  t-1,  the  true  membership  set  is  outer  bounded  by  the  ellipsoid  Et.i 
described  by 

Eh  =  {0  e  RN  :  [9  -  e(t-l)  f  P-l(t-l)  [0  -  e(t-l)]  <  1  }  (2.2.3) 

where  P'l(t-l)  is  a  positive  definite  matrix,  and  0(t-l)  is  the  center  of  the  ellipsoid.  At 
time  instant  t,  observation  y(t),  defines  a  set  Si ,  which  is  a  degenerate  ellipsoid  in  the 
parameter  space 


St  =  [0  e  RN  :  [y(t)  -  0T<I)(t)]2  <  y2  } 


(2.2.4) 
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From  (2.2.1)  and  (2.2.2)  it  is  clear  that  Si  contains  the  true  parameter  vector.  An  ellipsoid 
El  which  contains  the  iniersecnon  of  Ei.j  and  St  is  then  given  by 

Et=  iOe  RN:  [9-0(1-!)  ]T  p-I(t-{)  [0  -  0(t-l)J  +  A,  [y(t)  -  eTcD(t)]2 

<l+XtY“}  (2.2.5) 

where  At  is  a  positive  time  varying  updating  gain  which  is  chosen  to  minimize  the  size  of 
the  ellipsoid  Et.  By  performing  some  tedious,  but  straightforward  algebraic  manipulation, 
it  can  be  shown  that  Et  can  be  expressed  in  the  form 

Et  =  (0  6  RN  ;  [0  -  0(t)  ]T  P'kt)  [0  -  0(t)J  <  1  }  (2.2.6) 

where 

Q-l(t)  =  P-kt-l)  + AtO(t)cDT(t)  (2.2.7) 


P-Ht)  =  <j2(t)  Q-l(t) 


(2.2.8) 


a2(t)  =  1  +  Xt  y2  - 


Xt[y(t)-cpT(t)e(t-i)]2 

l+>,i<I>'P(t)P(t-l)0^(t) 


(2.2.9) 


e(t)  =  9(t-l)  +  Q(t)a)(t)[y(t)-OT(t)0(t-l)] 


(2.2.10) 


The  matrix  inversion  required  in  (2.2.10)  can  be  circumvented  by  using  the  matrix 
inversion  lemma  in  (2.2.7),  which  yields 


Q(t)  =  P(M)-;it 


P(t-l)cI)(t)cDT(t)P(t-l) 
1+  XtCDT(t)P(t-l)OT(t) 


(2.2.11) 


In  order  to  ensure  that  the  initial  ellipsoid  Eq  contains  0*,  Eq  is  taken  to  be  a  large  ball 


centered  around  zero,  i.e.  P(0)  =  M.I,  where  M  is  a  large  number  and  I  is  the  N  by  N 
identity  matrix,  and  0(0)  =  0.  If  an  initial  estimate  of  0*is  available,  then  0(0)  could  be 


set  to  that  value. 
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In  the  minimal  volume  sequential  (MVS)  algorithm,  at  every  time  instant  t,  the 
determinant  of  P(t),  which  is  proportional  to  the  volume  of  the  enclosing  ellipsoid  E[ ,  is 
minimized  with  respect  to  This  yields  the  following  formula  for  the  gain  factor 


If  2N[7 

2  -  6-(t)  ]  -  G(t)  >  0,  then  At  =  0 

(2.2.12a) 

with  6(t) 

=  y(t)-cI)T'(t)e(t-l) 

otherwise 

,  -a2  +  V «'>“  -  4a]a3 

At  - 

(2.2.12b) 

2ai 

where 

ai  =  (2N-1)  Y^G^lt) 

(2.2.12c) 

ctl  =  G(t)  [(4N- 1  )y  2  -  G(t)  +  52(t)] 

(2.2.12d) 

a3  =  2N[Y2-52(t)]  -G(t) 

(2.2.12c) 

and 

G(t)  =  OT(t)P(t-l)OT(t)) 

(2.2.12f) 

Note  that  when  =  0,  Et=  Et-i,  i.e.  0(t)  =  9(t-l)  and  P(t)  =  P(t-l).  The  evaluation  of 
Xi  is  thus  the  basis  of  a  discerning  update  strategy,  whereby,  the  "  innovativeness  "  of  the 
observation  pair  {y(t),  0(t)}  is  checked  in  (2.2.I2a).  An  update  (  At  0)  occurs  only  if  it 
is  possible  to  construct  an  ellipsoid  Et,  which  bounds  the  intersection  of  Et-i  and  St ,  and 
whose  volume  is  less  than  Ei.i  . 

The  computational  complexity  of  the  ellipsoid  updating  formula  (2.2.8)-(2.2.1 1)  is 
O(N^),  which  is  same  order  of  computational  complexity  as  that  of  the  RLS  algorithm. 
For  the  discerning  update  strategy  ,  the  major  contributor  to  the  computational  cost  is  the 
computation  of  G(t)  =  OT'(t)P(t-l)cI)T^(t),  which  takes  (N^  +  N)  multiplications  and 
(N+1)(N-1)  additions. 


Instead  of  choosing  /.( to  minimize  the  volume,  it  can  be  chosen  to  minimize  the  sum 
of  the  semi-axes  of  the  bounding  ellipsoid  Et.  This  is  achieved  by  minimizing  the  trace  of 
the  matrix  P(t).  Tlie  resulting  minimum  trace  sequential  (MTS)  algorithm  has  a  different 
update  strategy  .  In  case  of  an  update,  in  order  to  find  the  optimum  updating  gam  factor, 
the  positive  root  of  a  cenain  third  order  polynomial  has  to  be  found  [Fogel,  1982],  The 
MTS  algorithm,  therefore,  has  a  higher  computational  cost  than  the  MVS  algorithm. 


2.3  The  DHOBE  Algorithm 

The  updating  gam  factors  of  the  MVS  and  MTS  algonthms  of  the  above  section  are 
chosen  to  minimize  the  size  of  the  bounding  ellipsoid.  This  is  no  doubt  desirable,  when 
the  pai  ameters  of  the  unknown  system  are  fixed.  However,  if  in  case,  the  true  parameter 
changes  after  the  the  ellipsoid  has  shrunk,  it  is  possible  that  the  resulting  bounding 
ellipsoid,  if  it  exists,  will  no  longer  contain  the  true  parameter  and  hence  it  will  not  be 
possible  to  track  the  true  parameter.  Thus  from  the  point  of  view  of  tracking  time  varying 
parameters,  it  may  not  always  be  advantageous  to  minimize  the  size  of  the  bounding 
ellipsoids. 

The  motivation  for  the  development  of  the  DHOBE  algorithm,  stems  more  from  the 
point  of  view  of  adaptive  filtering  and  prediction  error  minimization,  rather  than  from 
membership-set  parameter  estimation.  This  accounts  for  the  similarity  between  the 
DHOBE  algorithm  and  some  bounded  error  algorithms  proposed  in  the  adaptive  control 
literature  [Fortescue,  1981;  Ortega,  1987].  The  quantity  which  is  minimized  in  the 
DHOBE  algorithm  is  a  cenain  upper  bound  on  the  normalized  parameter  estimation  error. 
This  yields  an  updating  rule  which  has  a  computational  complexity  of  only  (N-t-1) 
multiplies.  Furthermore,  minimizing  with  the  above  criterion  greatly  enhances  the 
analytical  tractability  of  the  algorithm.  Analysis  shows  that  the  a  priori  prediction  error  is 
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asymptotically  bounded  by  the  bound  on  the  noise.  Additionally,  a  degree  of 
uncorrelatedness  between  the  inputs  and  the  noise  is  sufficient  for  asymptoric  cessation  of 
updates  in  the  fixed  parameter  case.  Tlie  updating  gain  factor  in  the  algorithm  also  plays 
the  dual  role  of  a  forgetting  factor.  This  improves  the  tracking  capability  of  the  algorithm 
vis  a  vis  any  of  the  MSPE  algorithms.  On  the  flip  side,  since  the  size  of  the  bounding 
ellipsoids  is  no  longer  the  criterion  for  optimization,  the  DHOBE  algorithm,  in  general, 
yields  bounding  ellipsoids  which  are  larger  than  those  yielded  by  other  OBE  algorithms. 
The  above  remarks  will  become  clearer  after  the  following  discussion. 

The  sequence  of  optimal  bounding  ellipsoids  in  the  DHOBE  algorithm  is  developed 
as  follows.  Let  the  bounding  ellipsoid  at  time  instant  t-1  be 

Et-i  =  {9  €  RN  ;  [9- e(t-i)  ]Tp-i(t-l)  [0-e(t-l)]  <a2(M)  )  (2.3.1) 

where,  as  in  the  previous  section,  P'*(t-1)  is  a  positive  definite  matrix,  and  9(t-l)  is  the 
center  of  the  ellipsoid.  The  factor  a^ft-l)  is  a  positive  time  varying  scalar,  which  along 
with  P'*(t-1)  determines  the  size  of  Et-i.  Since  the  true  parameter  9*  €  Et-i,  a^ft-l)  can 
also  be  thought  of  as  being  an  upper  bound  on  the  normalized  parameter  estimation  error 

V(t-l)  =  (0*-0(t-l)  ]T  P->ft-l)  [0*-9(t-l)l 

An  ellipsoid  Et  which  bounds  the  intersection  of  Et.i  and  the  set  Si  ,  where  St  is 
described  by  (2.2.4),  is  then  given  by 

Et=  {9g  RN;  (i-Xt)[0-0(t-l)  P-l(t-l)  [0-9(M)]  +Xt  [y(t)  -  9'fO(t)j2 

<(l-Xt)a2(t-l)  +  XiY2}  (2.3.2) 

Comparing  (2.2.5)  and  (2.3.2),  it  can  be  seen  that  in  (2.3.2),  >.t  is  an  updating  gain 
factor  and  ( 1-  Xt )  is  a  forgetting  factor.  By  performing  some  algebraic  manipulations,  an 
expression  for  Et  can  be  obtained  as 
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(2.3.3) 

(2.3.4) 

(2.3.5) 

(2.3.6) 

(2.3.7) 


The  initial  conditions  are  chosen  to  ensure  that  9*  €  Eq.  A  possible  choice  is 

P(0)  =  I,  and  0^(0)  =  l/e^  where  e  «  1 . 

The  updating  gain  A.t  is  chosen  to  minimize  a2(t)  at  every  instant  t.  The  minimization 
procedure  yields  the  following  updating  criterion 

If  a2(t-l)  +  62(t)  <Y^  then  Xt  =  0  (No  update)  (2.3.8) 


where  the  a  priori  prediction  error 

5(t)  =y(t)-OT(t)0(t-l)  (2.3.9) 

Otherwise  if  a^It-l)  +  52(t)  >7^,  then  the  optimum  value  of  is  non-zero  and 
calculated  according  to 

Xt  =  rnin(a,Vt)  (2.3.10) 


where 
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r  ^ 

l-(3(t) 


if  52(tj  =  0 
ifG(t)  =  1 


(2.3.11) 

(2.3.12) 


Vt  = 


■[1 


l-G(t) 

+P(t)[G(i)-l 


Tim 


']  if  l+p(t)lG(i)-ll  >0  (2.3.13) 


a 


if  l+p(t)lG(t)-ll  <0  (2.3.14) 


where  a  is  a  user  chosen  upper  bound  on  satisfying 


luid 


and 


0  <  a  <  1 


G(t)  =  OT(t)P(t-l)OTft) 


P(t)  = 


y2-  g2(t- 1 ) 
52(t) 


1 2.3. 1 5) 

(2.3.16) 

(2.3.17) 


The  main  results  of  the  convergence  analysis  of  the  DHODE  algorithm  [Dasgupta. 
1987]  are 

( i)  { a^(t) )  is  a  non-increasing  sequence  and 

lim  a2(t)  =  where  0  <  a2  < 

(ii)  The  eigenvalues  of  P(t)  are  upper  and  lower  bounded  provided  the  inputs  are 
sufficiently  rich  in  frequency  and  sufficiently  uncorrelated  with  the  noise,  i.e.  if  there 
exist  Pi,P2  >  0.  such  that  for  some  M  and  all  t 

t+M 

Pll<  £w(t)WT(t)  <p2l  (2.3.18) 

i=t+n 


where 
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W(t)  =  uit-n,..,u(t-n-m),  v(t),  vd-l). 


with  n  being  the  order  of  the  AR  pan,  and  m  being  the  order  of  the  exogenous  pan; 
then  there  exist  tt],  ut  >0,  such  that  for  all  t 

ai  I  <  P(t)<  a:  I 

iiii)  If  Git)  is  bounded,  i.e  if  the  conditions  of  (2.?.  18)  hold  then. 

I  a)  The  a  nrion  prediction  enors  are  bounded 


d-(t)  |().  1 

.U  tU 

’  b)  The  parameter  estimate  difference  converges  to  zero.  i.e. 

II  0(t)-0(t-r)II  -4() 

!  iv)  If  the  conditions  of  (2.3.18)  hold  and  0*  is  constant,  then 

lim  /.  —  0 

t 

c— >oo 

.A  detailed  derivation  of  these  convergence  results  and  the  pert'ormance  of  the  algonthm  ir. 
computer  simulations  can  be  found  in  [Dasgupta.  1987]. 

Looking  at  the  update  equatioco  of  the  OBE  algorithms,  it  is  clear  that  the  algonthms 
lue  similar  to  the  weighted  recursive  least-squares  (  WRLS)  algonthm.  The  next  section 
^hows  that  the  OBE  algonthms  are  in  fact  special  cases  of  the  V.’RLS  algorithm. 

2.4  Weighted  Least-Squares  and  the  OBE  Algorithms 

The  conventional  weighted  least  squares  algorithm  (Ljung,  19831  minimizes  the 
entenon 

,  T 

Vt  (0)  =  yX®!  ~  0’^O(i)]2 

i=l 


(2.4.1) 
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The  need  for  weighting  the  observations  can  arise  if.  for  example,  the  noise  variances  are 
time  varying  and  in  that  case,  ttf  can  be  taken  to  be  inversely  proportional  to  the  variance 
of  the  noise.  In  fact,  this  choice  of  a(  is  optimal  when  the  noise  is  uncorrelated  and 
gaussian,  i.e.,  the  resulting  parameter  estimates  have  the  minimum  covariance  over  all 
possible  choices  of  the  forgetting  factor.  If  in  addition,  it  is  required  to  discount  older 
values,  then  the  following  criterion  can  be  used 

T 

V't  (0)  =  yX  P(T.i)Iy(i)  -  eTO(i)]-  (2.4.2) 


where 


f3(T.i)  =  ;riT)p(T-l.i),  l<i<T-l 


(2.4.3) 


which  can  also  be  wntten  as 


ttj  ,  with  p(i,i)  =  ttj 

j=i+l 


(2.4.4) 


The  value  of  0  which  minimizes V'-p  (0),  can  easily  be  computed 


0(T)  = 


2^p(T.i)0(i)cD^i) 


^P(T.i)(D(i)y(i) 


1=1 


(2.4.5) 


A  recursive  form  for  0(t),  can  be  easily  developed  by  usuig  (2.4.3)  and  (2.4.5) 

0(t)  =  0(M)  +aiP(t)cI)(t)[y(t)-OT(t)e(t-l)]  (2.4.6) 

where 

P-*(t)  =  ?r(t)P-i(t-l)  +  ai(I)(t)OT(t)  i2.4.7) 


Thus  the  parameter  vector  0(t)  described  by  (2.4.6)  and  (2.4.7)  minimizes  V\(0). 
Comparing  (2.4.6)  and  (2.4.7)  with  (2.3.4)  and  (2.3.6)  of  the  DHOBE  algonthm.  it  is 
apparent  that  At  in  (2.3.6)  plays  the  same  role  as  at  in  (2.4.6)  and  (2.4.7).  Also  ( 1-  ) 
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in  (2.3.4)  i.s  equivalent  to  A.{t)  in  (2.4.7).  Thus  the  DHOBE  algorithm  minimizes  the 
cnterion 

=  :}:Sq.Tly(i)  -  (2.4.8) 

‘  1=1 

where 

T 

qi.T  =  [  nCl-Xj)]  A.i ,  and  Qi  i  =  ?ii  (2.4.9) 

j=i+l 

Thus  the  DHOBE  algorithm  can  be  described  as  a  weighted  recursive  least-squares  with 
forgetting  factor  algorithm,  with  the  weight  cXi  and  forgetting  factor  A.(t)  given  by 

/qt)  =  1-  At  1  tind  at  =  /it 

.Alternatively,  the  DHOBE  algorithm  can  be  described  as  a  weighted  recursive  least- 
squares  algorithm  with  non-causal  weights 

at  =  qt,T 

For  the  MVS  or  MTS  OBE  algorithms,  it  is  easy  to  show  that  the  corresponding  relations 
are 

^(t)=  — ^ —  and  ai  =  -^ 
a2(t)  a2(t) 

Thus  these  algorithms  are  also  special  cases  of  the  WRLS  with  forgetting  factor 
algorithm. 

One  implication  of  such  a  relation  between  the  OBE  algorithms  and  the  WRLS 
algorithms  is  that  it  facilitates  application  of  many  of  the  ideas  and  concepts  from  the 
least-squares  adaptive  filtering  literature.  For  example,  it  opens  up  the  possibility  of 
developing  exact  lattice  and  fast  transversal  filter  implementations  of  the  OBE  algorithms. 
In  fact,  this  equivalence  has  already  been  exploited  to  develop  a  systolic  array 
implementation  of  the  MVS  algorithm  [Deller,  1989). 
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Another  implication  is  that,  even  though  the  formulation  of  the  OBE  algorithms  is 
motivated  by  membership-set-theoreiic  considerations,  it  turns  out  that  the  parameter 
estimates  (the  centers  of  the  ellipsoids)  minimize  the  average  weighted  prediction  error.  In 
the  MTS  and  MVS  algorithms,  the  choice  of  the  weighting  sequence  is  motivated  by  a 
desire  to  reduce  the  size  of  the  bounding  ellipsoids.  In  the  DHOBE  algorithm,  the 
weighting  factor  is  chosen  to  minimize  an  instantaneous  upper  bound  on  the  normalized 
parameter  estimation  error.  Due  to  the  different  choices  of  the  weighting  and  forgetting 
factor  sequences,  the  parameter  estimate  trajectories  of  the  different  OBE  algorithms  and 
the  popular  exponentially  weighted  least-squares  (EWLS)  algorithm  (with  A.(t)=A  <  1, 
and  ttt  =  1 )  may  differ  significantly.  To  illustrate  this  point,  some  simulated  examples  in 
which,  the  parameter  estimation  error  of  the  OBE  algorithms  is  markedly  lower  than  that 
of  the  RLS  and  the  EWLS  algorithms  are  presented  next. 

2.5  Performance  in  Sinusoidal  and  Impulsive  Noise 

In  many  situations,  the  noise  process  affecting  the  observations  may  not  be  a  white 
noise  process,  or  may  not  even  be  stationary.  For  example,  the  noise  may  have  a  large 
sinusoidal  component,  as  in  the  case  of  observations  affected  by  electromagnetic 
interference,  and  in  the  case  of  helicopter  flight  data  [Goodwin,  1987].  In  some  cases, 
the  noise  may  be  bursty  or  impulsive,  and  thus  highly  non-stationary.  It  is  interesting  to 
compare  the  performance  of  the  OBE  algorithms  to  RLS  algorithm  in  these  situations. 
The  first  two  examples  compare  the  performance  of  the  DHOBE  and  MVS  algorithms  to 
the  RLS  algorithm  when  the  noise  is  an  additive  sum  of  white  noise  and  a  sinusoid. 

Example  2,1 

The  unknown  system  is  described  by  an  ARX(2,2)  model 


y(t)  =  -0.4  y(t-l)  -0.85y(t-2)  -0.2  u(t)  -0.7u(t-l)  +  v(t) 


The  input  u(t)  is  generated  by  a  pseudo-random  number  generator  which  is  uniformly 
distributed  in  (  - 1  ,-(-1 1.  The  noise  process  is  generated  by 

v(t)  =  A  sin  (coQt)  +  (1- A  )w(t)  (2.5.1) 

where  A,  the  amplitude  of  the  sinusoid,  satisfies  0<  A  <  1,  and  w(t)  is  white  and 
uniformly  distributed  in  [-l,-(-l].  The  frequency  a)o=7r/10.  The  amplitude  of  the  sinusoid 
is  vaned  from  0  to  1  and  for  each  value  of  A.  ten  Monte  Carlo  runs,  with  500  points 
each,  of  the  RLS  and  OBE  algorithms  are  performed.  The  estimation  error  is  measured 
by  calculating  the  final  mean-squared  parameter  estimation  error  (MSPEE)  defined  by 

■  1  N 

MSPEE  (db)  =  10 log  — T 116  (500)  -  6*11^  (2.5.2) 

iN^r  '  J 

Figure  2.1  displays  the  variation  in  MSPEE  for  the  RLS  and  OBE  algorithms  as  A  is 
varied.  The  mean-square  error  in  the  RLS  estimates  increases  drastically  as  the  amplitude 
of  the  sinusoidal  disturbance  increases.  This  is  because  the  RLS  estimates  are  biased 
when  the  noise  is  correlated  [Ljung,  1983].  The  estimates  of  the  OBE  algorithms  appear 
insensitive  to  the  amplitude  of  the  sinusoid.  Unfortunately,  one  cannot  derive  any  general 
conclusions  about  the  superiority  of  the  OBE  algorithms  in  the  presence  of  sinusoidal 
disturbances,  since  such  behavior  may  be  specific  to  the  panicular  ARX(2,2)  system 
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Figure  2. 1  Mean-squared  parameter  estimadon  error  for  the  OBE  and 
RLS  algorithms  applied  to  an  ARXf2,2)  model  with  a 
sinusoidal  disturbance 


Example  2.2 

In  order  to  make  some  definitive  statements  about  the  bias,  the  behavior  of  both  the 
algorithms  is  investigated  tor  a  simple  ARX(1,1)  model. 

y(t)=xy(t-l)  +  bou(t)  +  v(t)  (2.5.3) 

The  noise  v(t)  and  the  input  u(t)  is  as  in  Example  2.1.  The  absolute  value  of.x  is  required 
to  be  less  than  unity  to  ensure  stability  of  the  system.  An  expression  for  the  bias  in  the 
RLS  parameter  estimates  for  this  simple  model  is  derived  in  Appendix  2A.  It  is  shown 
that  if  [a,b  1^  are  the  values  of  the  parameter  estimates  that  the  RLS  algorithm  yields 
asymptotically,  i.e.  these  values  minimize  the  mean-squared  prediction  error,  then  for  the 
system  (2.5.3) 
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AS 

—  cosco 
2  ^ 

^  +  (aJ+(l-A)^ai) 

1  -2^cosa)o  +x^ 

L 

and 

b  =  ho  (2.5.4) 

where  and  Oyp-  are  the  variances  of  the  white  noise  w(t)  and  input  u(t)  respectively. 
Thus  only  the  AR  estimate  is  biased  and  the  bias  depends  on  the  noise  and  input 
variances,  the  amplitude  and  frequency  of  the  sinusoid,  and  the  true  AR  coefficient.  In 
particular,  the  bias  is  zero  when  A  =  0,  and  the  sign  of  the  bias  is  the  same  as  the  sign  of 
(coscoQ  -X  ).  Unfortunately,  a  corresponding  expression  for  the  bias  in  the  OBE 
estimates  is  difficult  to  derive  on  account  of  the  presence  of  the  data  dependent  updating 
and  forgetting  factor.  In  order  to  get  some  comparison  of  the  bias  possible  in  the  two 
algorithms,  the  value  of  A  is  now  fixed  at  1,  i.e.  the  disturbance  is  a  pure  sinusoid  with 
(OQ  =  7t/10,  and  x  ,  the  system  AR  coefficient  is  varied  from  +1  to  -1.  The  value  of  bo  is 
set  equal  to  unity,  and  the  input  u(t)  is  generated  as  in  Example  2.1.  The  asymptotic  bias 
is  computed  by  averaging  over  10  Monte  Carlo  runs  of  500  points  each.  Figure  2.2  and 
Figure  2.3  show  the  variation  in  the  bias  in  the  parameter  estimates  as  x  is  varied.  It  is 
clear  that  the  OBE  estimates  are  biased  for  many  values  of  x  .  However,  the  bias  in  the 
AR  estimate  is  significantly  lower  than  the  bias  in  the  RLS  autoregressive  estimate.  The 
values  of  the  bias  in  the  RLS  algorithm,  yielded  by  simulation  are  very  close  to  the  values 
predicted  by  (2.5.4),  thus  verifying  the  analysis  of  Appendex  2A.  Unlike  the  RLS  case, 
the  MA  estimates  yielded  by  the  DHOBE  algorithm  seem  to  be  slightly  biased. 
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Figure  2.2  Variation  in  the  bias  of  the  RLS  and  OBE  AR  parameter  estimates 
as  the  AR  coefficient  is  varied  for  an  ARX(1,1)  system  model 


DHOBE 

RLS 

Theoretical 

MVS 


Figure  2.3  Variation  in  the  bias  of  the  RLS  and  OBE  MA  parameter  estimates 


The  average  final  values  of  [a2(t)]^det  [P(t)],  which  is  a  measure  of  the  final 
volume,  and  [a^ft)]  Tr  [P(t)],  which  is  measure  of  the  sum  of  the  semi  axes,  for  the 
DHOBE  algorithm,  are  plotted  against  x  in  Figure  2.4.  Also  plotted  are  the  corresponding 
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values  for  the  MVS  algorithm.  It  appears  that  the  ellipsoids  are  large  at  the  values  of  x  at 
which  the  bias  is  significant. 


-a-  Vol.  (DHOBE) 
Tr.(DHOBE) 
-a-  Voi.  (MVS) 

->■  Tr.  (MVS) 


Figure  2.4  Average  final  volume  and  sum  of  the 
semi-axes  for  Example  2.2 


Example  2.3 


In  this  example  the  performance  of  the  OBE  and  RLS  algorithms  is  compared  when 
the  observations  are  affected  by  bursts  of  equal  amplitude  disturbances.  The  system 
model  considered  is  the  same  ARX(2,2)  model  of  Example  2.1. 

y(t)  =  -0.4  y(t-l)  -0.85y(t-2)  -0.2  uft)  -0.7u(t-I)  +  v(t) 

The  noise  process  {v(t))  is  now  generated  as  follows.  At  every  instant  t,  t=  1,2,..,  1000, 
a  random  number  w(t)  e  [0,1]  is  generated.  If  w(t)  is  greater  than  \-prob  (where  prob  is 
the  preassigned  probability  of  a  burst  occurring),  then  v(t+j),  j  =  0,1, ..4,  is  set  equal  to 
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unity.  Thus  the  noise  sequence  is  composed  of  bursts  of  impulses  of  burst  length  >  5. 
The  average  parameter  estimation  error  MSPEE,  was  computed  over  10  runs  of  1000 
points  each  for  the  RLS  (forgetting  factor  /\,=  1,  P(0)  =  10001),  EWLS  (P(0)  =  10001,  A 
=  0.99  and  X  =  0.9),  the  DHOBE  (with  a  =  0.5,  P(0)  =1.  and  o2(0)  =  100)  and  the  MVS 
(P(0)=  1001)  algorithms.  Table  2.1,  lists  the  MSPEE  values  in  dBs  for  two  different 
values  of  prob.  The  quantity  in  parentheses  in  the  first  column  is  the  average  number  of 
impulsive  noise  points  in  the  1000  point  data  records.  The  quantiries  in  parentheses  in  the 
last  two  columns  are  the  average  final  volumes  and  final  sums  of  semi-axes,  respectively, 
for  the  DHOBE  and  MVS  algorithms.  The  pert'ormance  of  both  the  DHOBE  and  the 
MVS  algorithm  is  vastly  superior  to  that  of  the  RLS  and  EWLS  algorithms  when  the 
number  of  bursts  is  large.  However,  as  prob  decreases,  the  performance  of  the  RLS 
algorithm,  becomes  comparable,  and  of  course,  for  very  small  values  of  prob,  the 
parameter  estimation  error  of  the  RLS  algorithm  would  be  lower  than  that  of  the  OBE 
algorithms.  It  is  surprising  to  see  that,  in  both  the  cases,  the  DHOBE  algorithm  yields 
ellipsoids  of  smaller  volumes  than  the  MVS  algorithm.  Since  the  construction  of  the 
bounding  ellipsoid  Et  from  Et-i  is  different  for  the  MVS  and  DHOBE  algorithms  (c.f 
(2.2.5)  and  (2.3.2)  ),  the  DHOBE  algorithm,can  in  principle,  yield  ellipsoids  with 
smaller  volumes  than  the  minimum  volume  sequential  algorithm. 
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table  2.1 


prob 

RLS 

EWL. 

L=0.99 

S 

L=0.9 

DHOBE 

MVS 

0.05 

(239) 

-14.9 

-14.05 

-7.89 

-26.9 

(4x10-5,  0.35) 

-28.83 

(1.6x10-4.  0.52) 

0.02 

(105) 

-20.37 

-18.14 

-8.6 

-22.67 

(8x10-4,  0.73) 

-21.59 

(3.6x10-3,  1.16) 

Discussion  of  Simulation  Results 

In  the  presence  of  sinusoidal  disturbances,  the  unweighted  RLS  algorithm  yields 
biased  estimates.  The  estimates  of  the  OBE  algorithms  are  also  biased,  however,  the  bias 
is  less  than  the  bias  in  the  unweighted  RLS  algorithm.  The  bias  in  the  OBE  parameter 
estimates  appears  to  go  hand  in  hand  with  an  increase  in  the  volume  of  the  bounding 
ellipsoids.  In  the  impulsive  noise  case,  the  mean-square  error  in  the  OBE  estimates  is 
observed  to  increase  as  the  number  of  impulses  decrease.  This  behavior  is  contradictory 
to  the  behavior  of  the  RLS  estimates  and  deserves  further  investigation. 
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2.6  A  Modification  to  the  MVS  Algorithm 

It  has  been  observed  that  the  MVS  algorithm  does  not  always  yield  a  bounding 
ellipsoid  with  minimum  volume  (the  minimum  over  all  ellipsoids  formulated  according  to 
(2.2.5)  ).  Such  a  situation  can  occur  when  either,  (i)  one  of  the  hyperplanes-  H](t)  or 
H2(t) ,  which  define  St ,  does  not  intersect  the  bounding  ellipsoid  Ei-i;  or  (ii)  both  Hi(t) 
or  H2(t)  do  not  intersect  Ei.i.  In  the  latter  case,  the  smallest  possible  Ef  is  Et-i  itself.  In 
the  former  case,  the  non-intersecting  hyperplane,  can  be  replaced  by  a  parallel 
hyperplane  tangent  to  Ei-i.  Then,  by  appropriately  redefining  the  set  St,  a  minimum 
volume  ellipsoid,  which  bounds  the  intersection  of  Et-i  and  the  new  St  can  be  found, 
which  w'ill  have  a  smaller  volume  than  the  ellipsoid  obtained  by  the  conventional  MVS 
algorithm.  This  is  the  essence  of  the  modification  proposed  by  Belforte  and  Bona 
[Belforte,  1985].  The  modified  algorithm  is  developed  as  follows. 

The  equadons  defining  Hi(t)  and  H2(t)  are 

Hi(t)  =  {9  e  RN  ;  9T<D(t)  =  y(t)  -  Y  ) 
and 

H2(t)  =  {0  G  RN  :  eTcD(t)  =  y(t)  +  Y  ) 

Hi(t)  is  the  lower  hyperplane  and  H2  (t)  is  the  upper  one.  Assume  that  H](t)  does  not 
intersect  Ej.].  Then  the  equation  of  H]'(t)  parallel  to  H](t)  and  tangential  to  Et-i  is 

Hi'(t)=  {ee  RN;  0T<I>(t)  =  zi  ) 

zi  can  be  found  as  follows.  Assume  that  Hi'(t)  intersects  Ei.i  at  Gq-  Since  Hi'(t)  is 
orthogonal  to  the  gradient  vector  a  0o,  for  any  0  g  Hi'(t) 

{^[(e-0(t-l)fp-Vt-l)(9-9{t-l))]}  (0-0o)  =  2(9o-9(t-l))'^p-'(t-l)(0-0o) 

=  0 
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Since  Hi'(t)  is  also  orthogonal  to  0(t)  and  0-0o  lies  in  Hi'(t) 

P'‘(t-l)(9o-e(t-l)  )  =  TiO(t) 

for  some  r),  and  since  6o  is  on  the  boundmy  of  Et-i, 

(00  -  0(t  -  l)f  p-^t  -  l)(0o  -  0(t  -  D)  =  Ti  W(t)P(t  -  l)O(t)  =  1 


Hence  the  projection  of  0o  -  0(t-l)  onto  0(t)  is  of  length 

cD'^(t)(0o-0(t-l))  o'^(t)P(t-ncD(t)  V^^^t)P(t-l)0(t) 

hj  = - j== - =  T) - ,  —  =  - ===== - 

VO’^(t)cD(t)  ^/0^(t)cD(t) 

Vo'‘'(t)0(t) 

where,  as  before, G(t)  =  0^(t)P(t-l)<I>(t).  But,  since  the  equation  of  a  hyperplane  parallel 
to  Hi'(t)  and  passing  through  0(t-l)  is  0T'd)(t)  =  0^(t-l)<I)(t),  therefore 

^  _0'^(t-l)<D(t)-Zi 
Vo'^(t)0(t) 


Thus 


zi  =0T(t-l)O(t)-V^ 

It  can  be  shown  similarly  that  the  equation  of  a  hyperplane  parallel  to  H2(t)  and  tangent  to 
Et-i  is 


H2’(t)  =  (0  €  RN  :  0X0(1)  =Z2=  0T(t-l)<D(t)  +  ^jG(t) 


The  modified  algorithm  thus  consists  of  the  following  steps 


(i)  Evaluate  Z]  =  9T'(t-l)0(t)  -  v'G{t) ,  and  Z2  =  9^(t-  l)O(t)  +  \  G(t) 

(ii)  If  zi  >  y(t)  +  y,  or  Z2  <  y(t)  -  y,  then  the  intersection  of  Si  and  Ei-i  is  void  and 
hence  either  the  model  or  the  noise  bound  is  incorrect. 

(iii)  If  zt  >  y(t)  -  y,  and  zi  <  y(t)  +  y,  then  both  Hi(tj  and  H2(t)  do  not  intersect  E[.i, 
and  hence  Et  =  Ei.i,  is  the  bounding  ellipsoid  with  minimum  volume. 

(iv)  If  z\  <  y(t)  -  y,  and  Z2  >  y(t)  +  y,  then  both  Hi(t)  and  H2(t)  intersect  Et-i,  and 
hence  Ei  is  constructed  as  in  the  MVS  algorithm. 

(v)  If  zi  >  y(t,)  -  y,  and  Z2  >  y(t)  +  y,  then  only  hi  ,t)  does  not  intersect  Ei-i.  Hence 
replace  Hi^tj  by  Hj'(t),  and  construct  the  bounding  ellipsoid  Et. 

(vi)  If  zi  <  y(t)  -  y,  and  Z2  <  y(t)  +  y,  then  only  H2(i)  does  not  intersect  Et.].  Hence 
replace  H2(t)  by  H2’(t),  and  construct  the  bounding  ellipsoid  Ei. 

If  step  (v)  or  (vi),  the  bounding  ellipsoid  is  constructed  as  follows: 


Define 


and 


fi  =  max  [zi,  y(t)  -  y]  ;  f2  =  rnin  [z2  ,  y(t)  +  y  ] 


, . .  fi+h  ,  f2-fi 
=  Y  = 


TTien,  it  is  easy  to  see  that  in  step  (v)  or  (vi),  the  new  set  St'  is  defined  by 


St'  =  { 0  e  RN  :  [y’(t)  -  0T<D(t)]2  <  y  '2 


The  ellipsoid  which  bounds  the  intersection  of  Ei-i  and  St'  can  now  be  constructed 
exactly  as  in  the  MVS  algorithm.  The  update  equations  and  the  equations  for  the 
optimum  updating  gain  are  exactly  as  in  Section  2.2,  except  that  y(t)  is  replaced  by  y  (t) 
and  Y  is  replaced  by  y. 
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The  modified  MVS  algorithm  is  applied  to  the  ARX(2,2)  system  of  Example  2.1  of 
the  previous  section.  The  noise  sequence  is  white  and  uniformly  distributed  in  [-1,+1]. 
Five  Monte  Carlo  runs  (each  of  1000  data  points  )  of  the  MVS  and  modified  MVS 
algorithm  are  performed  The  average  final  volume  of  the  bounding  ellipsoid  is  1.4xl0‘'^. 
for  the  modified  MVS  algorithm  and  4xl0'2,  for  the  MVS  algorithm.  The  average  final 
sum  of  semi-axes  is  0.64  and  2.58  respectively.  Thus  the  modification  can  cause  a 
significant  reduction  in  the  size  of  the  bounding  ellipsoids. 

.-\n  attempt  was  made  to  modify  the  DHOBE  algorithm  in  the  same  manner.  The 
resulting  tests  in  the  modified  algorithm  are  as  above  with  the  only  difference  being  that 
instead  of  using  G(t)  =  0^(1)  P(t- 1 )  ch(t)  to  construct  zj  and  Z2  ,  the  quantity 
G'(t)  =  a^(t-l)tI)^(t)P(t-I)<I)(t)  is  used.  When  simulated,  however,  the  modified  DHOBE 
algonthm  did  not  update  frequently  enough.  This  is  because  if  y  is  small  for  some  time 
instant  t,  then  o^it)  tends  to  decrease  by  a  large  amount.  Then  for  future  time  instants  k. 
when  both  Hi(t)  and  HiCt)  intersect  E  t-i  ,  a2(k-l)  is  much  smaller  than  and  hence  the 
sum  a2(k-l)  +  52(k)  is  much  smaller  than  ‘f- ,  thereby  not  permitting  an  update.  Thus  the 
modification  of  Belforte  and  Bona  does  not  appear  to  be  applicable  to  the  DHOBE 
algorithm. 


CHAPTER  HI 


ARMA  PARAMETER  ESTIMATION 

3.1  Introduction 

In  many  areas  or  signal  processing,  only  samples  of  a  signal  y(ti  are  available,  and  ii 
IS  required  to  obtain  a  model  which  can  describe  the  signal  as  accurately  as  possible.  For 
example,  in  speech  processing,  only  samples  of  speech  are  available,  since  it  is  not 
possible  to  measure  the  glottal  excitation.  In  seismic  data  processing,  often  only  the 
response  of  the  seismic  layers  to  an  excitation  is  measurable,  while  the  actual  probing 
input  is  unknown.  In  radar  and  array  signal  processing,  high  resolution  spectral 
esnmation  is  often  performed  by  first  fitting  a  model  to  the  received  signal. 

.A  powerful  and  increasingly  popular  way  to  model  the  signal  of  interest,  is  tc' 
assume  that  it  is  the  output  of  an  HR  filter  dnven  by  unKnown  white  noise  w(tj 
[Friedlander.  1982].  The  signal  y(t)  is  thus  modeled  as  an  autoregressive  moving 
averaget  ARMA )  process  of  the  form 

yiti  =  u,  y(t-l )+..+  aj,y(t-n!  +  w(t)  +  c,  w(t-l )  -r..+  u,  w(t-r)  (3.1.1  ) 

Fitting  this  ARMA  model  to  the  mea.sured  data  y(t).  t  =1,2...  .  requires  the  estimation  of 
the  parameters  a-,  ...  a^  ,  c,...  c, . 

Even  in  cases  wnen  the  input  is  known,  as  in  control  applications,  the  need  for 
.ARMA.  or  more  accurately  ARMAX  parameter  esnmanon  anses.  For  example  a  DARAlA 
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model  with  input  and  output  measurement  noise  is  equivalent  to  an  ARMAX  model.  This 
can  be  shown  as  follows.  TTe  DARMA  process  [Goodwin,  1984)  can  be  expressed  in 
the  form 

A(q-')y(ti  =  B(q-M  u(t)  (3.1.2) 

where  A(q'’)  =1  -  a,  q-'  -  a2  q'^  -  a^q'",  B(q-')  =  bo  +  b,  q  '  ^  b^  q  -  -  •  +  b^q-^ 

and  q-‘  is  the  delay  operator.  The  input  {u(t))  is  assumed  to  be  measurable.  If  the  inputs 
and  outputs  are  subject  to  white  measurement  noise,  then  the  obserx'ed  outputs  >'^(1)  and 
observed  inputs  u^^lt)  are  given  by 

y^(t)  =  y(t)  +  pit)  (3.1.3a) 

u^(t)  =  u(t)  +  s(t)  (3. 1.3  b  I 

where  p(t)  and  s(t)  are  assumed  to  be  zero  mean,  uncorrelated  white  noise  processes  with 
variances  and  respectively.  Substituting  (3.1.3)  in  (3.1.2)  gives 

A(q-')  >'„,(!)  =  B(q-')  Un,  (t)  +  A(q->)  p(t)  -  B(q-')  s(t)  (3.1.4) 

By  the  spectral  factonzation  theorem  [Astrom.  1986],  the  noise  terms  in  (3.1.4)  can  be 
replaced  by  a  unique  spectrally  equivalent  minimum  phase  process,  so  that  (3.1.4) 
becomes  an  ARMAX  process 

A(q-‘)yr^([)  =  B(q->)  u^  (t)  +  D(q->)  w(t)  i3.1.5) 

where 

D(  z)D(  z-‘  I  =  A(  z)A(  z-‘)  -  0^2  B(  z)B(  z  ’ )  (3.1 .6 ) 

The  DARMy\  process  with  input  and  output  measurement  noise  is  thus  a  special  case  of 
an  ARMAX  process. 

Many  methods  for  the  estimation  of  ARMA  parameters  have  been  proposed  in  the 
literature,  particularly  from  the  spectral  estimation  viewpoint.  Among  the  more  recent  are 
Cadzow's  overdetermined  rational  equation  method  fCadzow,  1982),  the  spectral 
matching  technique  of  Friedlander  and  Porat  fFriedlander.  1984],  and  the  extended 
Yule-Walker  method  of  Kaveh  [Kaveh.  1979).  A  common  feature  of  these  methods  is  the 


use  of  the  sample  autocorrelation  sequence  of  the  output  process  y(t).  In  the  context  of 
system  identification,  the  extended  least-squares  (ELS),  the  recursive  maximum 
likelihood  (RML)  and  multi-stage  least-squares  algorithms  have  been  used  to  recursively 
estimate  ARNIA  parameters  [Ljung,  1983,  Mayne,  1982].  The  ELS  algorithm  uses  the 
posteriori  prediction  error  e(t),  as  an  estimate  of  w(t).  The  regressor  vector  is  formed 
from  y(t-l),..,  y(t-n)  and  e(t-l),..,  e(t-r).  The  standard  RLS  algorithm  is  then  employed 
to  update  the  estimates.  The  algorithm  is  conceptually  simple  but  restrictive  in  the  sense 
that  convergence  of  the  algorithm  can  be  assured  only  if  the  underlying  transfer  function 
H(q  ‘  I  =  1/  C(q  ‘  )  -  1/2  is  stnctly  positive  real  (SPR).  with  q-‘  being  the  delay  operator 
and 

C(q'',)  =1-1-  C]  q'^  -t-  C2  q'^  -h  ..  -i-  c^  q-f  (3.1.7 ) 

The  transfer  funcuon  H(q-‘ )  is  SPR  if  there  exists  an  £  such  that 

Re[H(eJ“)]  >  £  >  0,  for  all  co  €  [-Tc.Ttj 

The  RML  algorithm,  which  uses  a  filtered  version  of  the  regressor  vector  used  in  the 
ELS  algorithm,  does  not  require  H(q-’  )  to  be  SPR.  However  the  estimates  have  to  be 
monitored  and  projected  into  a  stability  region  to  ensure  convergence[Ljung.  1983]. 

In  this  chapter,  the  DHOBE  algonthm  is  extended  to  the  ARMA  case.  For  the  ARNLA 
parameter  estimation  problem,  since  the  input  sequence  {w(t))  is  unobservable,  the 
DHOBE  algorithm  cannot  be  applied  in  its  present  form.  However,  by  assuming  that  the 
input  white  noise  is  bounded  in  magnitude,  the  DHOBE  algorithm  can  be  extended  in  a 
manner  similar  to  the  ELS  algorithm.  The  simpler  optimization  critenon  of  the  DHOBE 
alcorithm  makes  the  convergence  analysis  of  the  extended  algorithm  tractable.  The 
analvsis  is  penormed  by  imposing  a  bound  on  the  sum  of  the  magnitudes  of  the  M.A 
coefficients.  This  is  sufficient  to  ensure  that  the  true  parameter  vector  is  contained  in  all 
the  optimal  bounding  ellipsoids.  The  algorithm  thus  gives  l(X)9c  confidence  regions  for 
the  parameters  at  every  instant.  The  ELS  or  RML  algorithms,  in  contrast  can  yield 


37 


confidence  regions  only  asymptotically.  A  uniform  bound  on  the  a  posteriori  prediction 
error  of  the  extended  DHOBE  algorithm  is  derived.  In  contrast,  even  though  the  a 
posteriori  prediction  errors  are  generated  in  a  stable  fashion  in  the  ELS  algorithm[Ljung. 
1983].  it  is  difficult  to  obtain  an  expression  for  even  the  asymptotic  bound,  if  such  a 
bound  exists.  By  imposing  a  persistence  of  excitation  condition  on  the  regressor  vector, 
the  a  priori  prediction  error  of  the  extended  DHOBE  algorithm  is  shown  to  be  bounded 
and  the  parameter  estimates  are  shown  to  converge  to  a  neighborhood  of  the  true 
parameter  vector.  Simulations  show  that  the  performance  of  the  algorithm  is  comparable 
to  the  ELS  algorithm  as  far  as  the  mean-squared  parameter  estimation  error  fMSPEE)  is 
concerned.  It  has  also  been  observed  ,  in  a  number  of  examples,  that  the  transient 
performance  of  the  extended  algorithm  is  superior  to  the  ELS  algorithm. 

3.2  Extension  to  ARMA  Parameter  Estimation 

The  ARMA  model  described  by  (3.1.1),  can  be  rewritten  as 

w(t)  =  y(t)  -  (3.2.1 ) 

where  0*.  the  vector  of  true  parameters,  and  0'(t)  are  defined  by 

0*  =  [a,  ,  an  ...  a^  ,  c,  ,  C2  ....  c^  F 

0'(t}  =  [  y(t-l)  ,  ..,  y(t-n)  ,  w(t-l)  ,  ....w(t-r)  ]  ^ 

Here  again.  w(t)  is  assumed  to  be  bounded  in  magnitude,  i.e.  there  exists  positive  y,, 
such  that 

I  w(t)  I  <  Yo  (3.2.2 ) 

Since  the  values  of  the  noise  sequence  {w(t)  )  are  not  available,  the  regressor  vector 
0’(t)  is  not  known  exactly.  If,  however,  at  time  t,  an  estimate  of  0*. 

0(t)  =  [  a,(t).  ..,  Oj,  (t)  c,  (t),  ...c^  (i)  (3.2.3) 

is  available.  w(ti  could  be  estimated  by  the  a  posteriori  prediction  error!  also  called  the 


38 


residual  error  by  some  authors  ) 

E(t)  =  y(t)  -  e^COOCt)  G.2.4) 

where 

0(t)  =ly(t-l),  ..y(t-n),e(M),  ...  e(t-r)  ]  t  (3.2.5) 

Now  just  as  in  the  ARX  case,  define  for  some  suitable  y  the  set 
{  0  :  (  y(t)  -  eTOCt)  )2  <  y2, 6  e  R  ) 
and  the  bounding  ellipsoid 

E,=  {  e€R"^^•  (6-0(1)/  P''(t)(  0-0(0)  <  alt)  }  (3.2.6) 

The  update  equations  for  6(t),  P(t)  and  a2(t)  are  as  in  Section  2.3.  with  the  only 
difference  being  that  the  regressor  vector  is  now  given  by  (3.2.5). 

As  in  the  OBE  algorithm,  the  bounding  ellipsoids  are  optimized  by  choosing  \  to 
minimize  a^lt).  In  order  to  facilitate  the  subsequent  analysis,  the  initial  conditions  are 
modified  to 

P(0)  =  M  ’  9  (0)  =  0,  and  alO)  <  y  2  (3.2.7 ) 

where  M»1 .  and  is  the  identity  matrix  of  dimension  n+r. 

This  choice  of  initial  conditions  ensures  that  the  initial  ellipsoid  Eq  will  contain  the 
true  parameter  vector  0*  and  more  imponantly.  as  shown  in  Appendix  3.^.  simplifies 
the  optimum  forgetting  factor  formula  to 

If  a2(t-l )  +  52(t)  <  y2  then  =  0,  (3.2.8) 


otherwise 


ifG(t)  =  1 


(3.2.9a) 


l-P(t) 

1 


X 


* 

t 


=  < 


1 


[i- 


G(t) 

(i(t)(G(t)-l) 


where 


Y~-  O^(t-l) 

5^(t) 


ifG(t)^l  (3.2.9b) 


(3.2.9c) 


Remarks 


(1)  It  IS  shown  in  Appendix  3A  that  if  csHi-l)  +  52(t)  >  then  given  by  (3.2.9) 
satisfies 

=0 

and  funhermore  0<  /v,*  <1.  Thus  unlike  the  case  in  Section  2.3,  no  upper  bound  need  be 
imposed  on  the  forgetting  factor. 

(2)  Since  a^it)  =  o^it-l)  if  =  0,  any  non-zero  value  of  which  minimizes  a^(t). 
will  cause  a^it)  <  a^lt-l).  Thus  choosing  X*  to  minimize  a2(t).  causes  (a2(t)}to  be  a 
non-increasing  sequence. 

The  recursive  relations  (2.3.5)  -  (2.3.7),  the  initial  conditions  (3.2.7).  the  selective 
update  strategy  (3.2.8  )  and  the  forgetting  factor  determination  formula  (3.2.9)  form  the 
Extended  Optimal  Bounding  Ellipsoid  (EOBE)  estimation  algorithm.  The  choice  of  the 
threshold  'f-  will  become  clear  from  the  analysis  below'.  The  algorithm  retains  the 
discerning  update  strategy  and  the  modular  adaptive  filter  structure  of  the  DHOBE 
algorithm. 
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3.3  Convergence  Analysis  of  the  KOBE  Algorithm 

The  main  difficulty  in  the  analysis  of  the  EOBE  algorithm  arises  from  the  presence  of 
the  a  posteriori  prediction  errors  in  the  regressor  vector.  Unlike  the  OBE  algorithm,  in 
this  case,  boundedness  of  w(t)  does  not  guarantee  that  all  the  sets  S; .  t=1.2..  will  contain 
0*.  The  first  step  in  the  analysis  is  to  find  conditions  under  which  this  happens.  The 
minimization  of  oHi),  at  every  time  instant,  and  the  choice  of  initial  conditions  (3.2.7), 
facilitate  the  characterization  of  the  behavior  of  the  a  posteriori  prediction  errors. 


Lemma  3.1.  For  the  EOBE  algonthm.  if  G^(t-l)  +  5^(t)  >  y-.  i.e..  it  an  update 
occurs  at  time  instant  t.  then 

(i)  c2(t)  +  e2(t)  =  -f,  (3.3.1 ) 

(ii)  e^fk)  <  e2(t)  for  all  time  instants  k  <  t,  (3.3.2) 

and  if  t+j  is  the  time  instant  at  which  the  next  update  occurs  then 

(iii)  e2(k)  <  e2(t)forall  k  <  t+j.  (3.3.3) 

Proof; 


(it  It  has  been  shown  in  Appendix  3A.  that  if  a2(t-l )  +  62(t)  >  y-then.  the  optimum 
forgetnng  factor  satisfies 


d  oTt) 


dA  A^  =  A 


=  0 


(3.3.4) 


Taking  the  derivative  in  (2.3.5)  and  using  (3.3.4)  yields 
,  .  (l-?.)5^(t)  X6^(t)G(t) 

y'-CT^lt-l) - ^ -  - - i (3.3.5a,) 

l-A+X^G(t)  (1-A+ >.G(t)  )“ 


which  can  be  rewritten  in  the  form 

n  (1  -  -  A.j'Gd) 

■f-o‘'(t-l)=  5"(t) - 1 - - 

(1  -A  +  ?L^G(t)  )- 


(3.3.5b) 
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In  (3.3.5)  and  in  the  remainder  of  the  chapter,  when  there  is  no  risk  of  confusion,  the 
optimum  forgening  factor  will  be  denoted  by  X^.  The  a  posteriori  prediction  error 
e(t)  =  y(t)  -  eT(t)C)(t)  =  y(t)  -  0  T(t-l)d)(t)  -  \  O^lt)  P(t)0(t)  6(t) 

Thus 

e(t)  =  [l-X,0'i'(t)  P(t)0(t)]  5(t) 

Multiplying  P(t)  in  (23  J)  on  the  left  by  ^^(t)  and  right  by  0(t)  yields 

T  l-X 

1-X  O  (t)P(t)<D(t)  - - ^ - 

‘  l-X+X^G(t) 

Thus 

1  -  A 

e(t)  =  - i -  5(t)  (3.3.6) 

1  -  X +  X  G(t) 

Note  that  the  non-negativeness  of  G(t)  implies  that  eHi)  <  52(t).  Substituting  (3.3.6)  in 

(3.3.5b)  and  rearranging  terms  yields 

,  o  .  XjG(t)E^t)  (3.3.7) 

( 1  -Xj)  y‘-  ( 1  -\)  a‘(t- 1 )  =  ( 1  -X^)  e"(t)  - - - - 


Now  using  (3.3.6)  in  (2.3.5)  gives 

-I  ^  -V  1  X”G(t)  -) 

a“(t)  =  (1- X ,)  alt-l)  +  X,y*--XE'(t) - E'(t)  (3.3.8) 

1-  X^ 

Finally,  subtracting  (3.3.8)  from  (3.3.7)  gives  (3.3.1). 


(ii)  Case  1  If  k  <  t,  is  an  updating  instant.  Then  (3.3.1)  gives 

a2(k) +  E2(k)  =  y2  (3.3.9) 

But  since  {a2(t)}  is  a  non-increasing  sequence,  (3.3.9)  and  (3.3.1 )  together  would  imply 
that 

E2(k)  <  £2(1) 

Case  2  If  k  <  t,  is  a  non-updating  instant,  then  E2(k)  =  52(k)  and  so  by  (2.3.8). 
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a2(k-l )  +  e2(k)  <  y2.  and  since  a2(t)  is  non-increasing,  e2(k)  <  e2(t). 

(iii)  Since  ,  k  =  t+1.  are  ail  zero,  o2(k)  =  a^l),  for  all  t  <  k  <  t-i-j.  And 

because  k  is  a  non-updating  instant,  o2(k-l  )  -i-  e^fk)  =  +  e^ik)  <  y-  ,  and  so 

(3.3.3)  follows. 


Remark 

Lemma  3.1  shows  that  the  the  sequence  of  squared  residual  errors  evaluated  at  the 
updating  instants  is  non-decreasing.  Furthermore,  the  squared  residual  error  at  any 
non-updating  instant  is  not  greater  than  the  squared  residual  errors  at  the  updanng  instant.s 
immediately  pnor  and  after  the  non-updating  instant.  This  characterization  of  the  squared 
residual  errors  is  useful  in  deriving  sufficient  conditions  under  which  the  convex 
polytopes  St  and  Ej  will  contain  0*. 


Theorem  3.1. The  sets  S;  and  consequently  the  ellipsoids  Ej.  t  =  1.2...  will  contain  the 
true  parameter,  if 

(i)  Eo  contains  6*  .  (3.3.10a) 


(ii)  The  true  moving  average  coefficients  satisfy 


1=1 


(iii) The  threshold  y  satisfies 


i-i'c.i 


i  =  l 

Proof  Let  the  induction  hypothesis  be  0*  e  Et.,.  Then  defining 


V(t)  =  (e(t)-e^  )T  P-i(t)  (0(0-0*) 


(3.3. 10b ) 


(3.3.10c) 


(3.3.11) 


and  recalling  the  definition  of  E[.i  yields 
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V(t-1)<  c2(t-l)  (3.3.12) 

and  since  P''(t)  is  positive  definite  for  all  t.  a2(t-l)  >  0.  Now  using  (3.2.1)  and  (3.2.5) 
yields 

y(t)  -  (l)(t)  =  C(q-i)[w(t)] -(C(q-i)- l)[E(t)]  (3.3.13) 

where  the  operator  C(q'^)  has  been  defined  in  (3.1.7).  Defining  n(t)  =  C(q-i)[w(t)]  then 
yields 

iy(t)  -  0*'r<l>(t)l<  I  n(t)  I  +  I  c,  e(t-l)  +  C2  £(t-2)...+ c^Elt-r)  I 
But 


1  n(t)  I  <  y  ,  for  all  t 

where 

r 

Y'  =  YoH+^Ic.I)  (3.3.14) 

i=l 


Hence 

ly(t)  -  e*T  0(t)  I  <  y  +  ( ICjl  te(t-l)l  +  Icol  lE(t-2)l+  ...+  Ic^l  lE(t-r)l  )  (3.3.15  ) 

But  by  Lemma  3.1,  if  t-j  is  the  updating  instant  immediately  preceding  time  instant  t.  then 

le(t-i)l  <  le(t-j)l  for  1<  i  <  r 

Thus 

lyd)-©*^  0(t)l  <  y'  -  Ie( t  -  j)l^lc,l  (3.3. 16i 

1=1 

Now  £2(t-j)  =  Y^  -  a2(t-j)  =  Y‘  -  o2(t-i).  By  the  induction  hypothesis.  (j2(t-l)  >  0, 
Hence 


•  I  e(t-j)  I  <  Y 

and  so 

r 

ly(t)-0  0(t)l  <  Y  + 

\=i 

So  S(  will  contain  0*  if 

r 

r  +  -  y 

i=l 


(3.3.17) 
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The  inequality  (3.3.17)  will  hold  iff  (3.3.10b)  and  (3.3.10c)  are  true.  Assuming 

(3.3.10b)  and  (3.3.10c)  thus  guarantees  that  for  all  time  instants  t 

(y(t)  -  e*TcD(t))2<  f  (3.3.18) 

Using  (2.3.6)  and  (2.3.5),  it  is  easy  to  show  that 

V(t)  =  (l-?.,)V(t-l)  +  >.t(y(t)-e*'^cI>(t)f-MzAi}^  (3.3.19) 

^  ’  l'A[+XtG(t) 

Now  using  (2.3.5)  in  the  above  equation  yields 

V(t)-a2(t)  <  (1- )  (  V(t-D  -  a2(t-l) )  +  X.  [  (y(t)  -  oit) )-- y2] 

and  so  from  (3.3.18)  it  follows  that 

V(t)-a2(t)  <  (1- )  (  V(t-1 )  -  a^Ct-l)  )  i3.3.20i 

Finally,  by  (3.3.12),  it  follows  that 

V(t)-o2(t)<0  (3.3.21) 

i.e.,  E,  contains  6*.  and  o2(t)  is  non-negative  for  all  t. 

Remarks 

(1)  The  condition  (3.3.10b)  implies  that  the  noise  sequence  n(t)  =  C(q’’)(w’(t)]  should 
not  be  '  too  colored  ”.  It  is  interesting  to  see  how  this  this  condition  relates  to  similar 
conditions  which  appear  in  the  convergence  analysis  of  other  parameter  estimation 
algorithms.  .A  sufficient  condition  for  convergence  of  the  ELS  algonihm  is  that  the 
transfer  function  [1/C(q'')  -  1/2]  be  SPR.  It  is  shown  in  Appendix  3B  that,  for  this 
transfer  function  to  be  SPR.  it  is  necessary  that 

r<l  (3.3.22) 

i=l 

Thus  if  (3.3.22)  is  not  satisfied,  the  transfer  function  is  not  SPR  and  so  the  ELS 
algorithm  cannot  be  guaranteed  to  converge.  Coincidentally  enough,  the  condition 
(3.3.10b)  is  identical  to  the  Strictly  Dominant  Passive  (SDP)  condition  [Dasgupta.  1987) 
which  appears  in  the  analysis  of  some  signed  LMS  algorithms.  In  Appendix  3B.  it  is  also 
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shown  that  if  (3.3.10b)  holds  then  fl/C(q‘*)  -  1/2]  is  SPR.  However,  the  converse  is  not 
true.  For  example  if  C(  q-')  =  1  +  0.8  q-‘  +  0.15  q-^  -  0.15  q-2  then  f  1/C(q-M  -  1/2]  is 
SPR.  but  C(  q'^)  does  not  satisfy  (3.3.10b).  Thus  this  condition  is  more  restrictive  than 
the  SPR  condition. 

(2)  Selection  of  the  right  "  noise  bound"  "f-  is  made  possible  by  (3.3.10c).  The  user 
would,  however,  need  to  have  some  knowleoge  of  the  magnitude  of  the  true  moving 
average  coefficients.  It  is  interesting  to  see  that  as  the  moving  average  filtering  of  { w(t) } 
increases  (Zlcjl  increases),  the  bound  "f-  is  required  to  increase,  to  guarantee  that  the  true 
parameter  is  contained  in  all  the  sets  Sjand  ellipsoids  Ej  .  Simulation  results  show  that 
overesnmation  of  'f-  has  very  little  effect  on  the  parameter  estimates  (centers  of  the 
bounding  ellipsoids),  though  it  may  have  an  adverse  effect  on  the  size  of  the  bounding 
ellipsoids. 

(3)  The  theorem  shows  that  V(t)  <  a^ft)  for  all  t.  Since  0^(0)  <  y-  .  and  {a2(t)}is  non¬ 
increasing.  the  theorem  provides  instantaneous  bounds  on  the  normalized  parameter 
esnmation  error  V(t). 

(4)  The  conditions  (3.3. 10b, 3. 3. 10c)  are  not  necessary  conditions  and  the  algorithm  has 
been  observed  to  perform  well  in  several  examples  where  these  conditions  were  violated. 

The  following  result  follows  straightforwardly  from  Lemma  3. 1  and  Theorerri  3. 1 . 

Corollary  3.1.  If  the  conditions  of  Theorem  3.1  hold  then 

(a)  lim  e"(t  )  exists  (3.3.23a) 

I  — >oo 

where  j  1. 1  is  the  sub.sequence  of  updating  instants  of  the  EOBE  algorithm,  and 
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(b)  Uniformly  bounded  a  posteriori  prediction  errors 

eHi)  <  Y^.for  all  time  instants  t  (3.3.23b) 

Boundedness  of  52(t),  the  a  priori  prediction  error,  and  convergence  of  the 
parameter  estimates  to  a  neighborhood  of  the  true  parameter  can  be  assured,  by  requiring 
the  regressor  vector  to  be  persistently  exciting.  The  next  lemma  relates  the  positive 
definiteness  of  P''(t)  to  the  richness  of  the  regressor  vector  Olt). 

Lemma  3.2.  If  there  exist  positive  a3  and  N  such  that,  for  all  t 

t*  N 

^  0(i)0'’^(i)  >0.1  >0  (3.3.24a) 

i=t 

then  there  exists  a  positive  a4  such  that 

P-i(t)  >04!  >0  (3.3.24b) 

Proof  of  the  lemma  is  the  same  as  that  of  Theorem  4.1  of  fDasgupta.  1987].  it  is  thus 
omitted  here. 

Remark.  The  positive  definiteness  of  P'*(t)  implies  that  the  eigenvalues  of  P(t)  are  upper 
bounded. 

Theorem  3.2.  If  the  assumptions  of  Theorem  3.1  are  satisfied  and  (3.3.24a)  holds  then 
the  EOBE  algorithm  ensures  : 

(a  )  Parameter  difference  convergence 

lim  II  0(0  -  e(t-k)  II  =  0  (3.3.25) 

I  — ^ 

for  any  finite  k. 

(b)  If.  in  addition,  the  process  (1.1)  is  stable  then  the  algonthm  yields  asymptotically 
bounded  a  prion  prediction  errors 
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5“(t)  (  0,  y'i  (3.3.26) 

(c)  If  the  moving  average  coefficients  are  further  restricted  to  satisfy 

^  2 

[  2,  I  c.  I  ]  <  0.5 

1=1 

then  an  asymptotic  bound  on  the  parameter  estimation  error  can  be  obtained 

Il0(t)-e*  II  =  [  0.  2  Y‘  (l+I  |c.  I  )^  /  aj  (3.3.27) 

where  Yr  ^ind  are  as  in  (3.2.2)  and  (3.3.24b)  respectively. 

Proof. 

(a)  From  (2.3.6)  and  (2.3.7 ) 

.  /-“  0^(t)  P“(t-1)  cpd)  S'iti 

110(0-0(1-1)11"  =  — - - -  (3.3.28) 

(  1-  -I-  G(t) ) 

A“  G(t)5“(t) 

^  ^ T  ‘3.3.29, 

(1  -  A  •+  Aj  G(t)  )■ 

where  e^a^(P(t-l  )|  is  the  maximum  eigenvalue  of  P(t-l).  and  11.11  denotes  the  euclidean 
norm.  To  see  how  (3.3.29)  follows  from  (3.3.28).  consider  any  positive  semi-definite 
symmemc  matnx  A  and  vector  .t.  Then 

Ax  =  A  Ax  =  y‘  Ay 

where  y  =  Ax.  But 

.v'Av>e„„M-‘|/y 

where  refers  to  the  minimum  eigenvalue.  Since  ejj,^(4)  =  1/  en,,j,(.-V'  i.  hence 

.'■^.'■^e^3^|.4).x^Ax 

x^4-a:  <en,a^(/4|.x^  Ax 


and  thus 
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Now  multiplying  both  sides  of  (3.3.5a)  by  A; ,  and  using  (2.3.5)  yields 


o“(t)  =  o“(t-l 


52(t)G(t) 

(  1  -  \  +  G(t)  )* 


(3.3.30) 


The  non-negativity  of  (t)  therefore  implies 


I 

I 


5-(i)G(i) 

(  1  -  A.  +  A.  G(i) )‘ 

1  1 


0^(0)-  a"(t) 


<  oo 


(3.3.31) 


Hence 

//  5"(t)G(t) 

urn  - ^ =  0  (3.3.32) 

l_  oe  (  1  -  A  -r  A  G(t  )  )' 

i  { 

If  (3.3.24a  )  holds  then  by  Lemma  3.2.  enjaj({P(t-l),  the  maximum  eigenvalue  of  P(t-l  ). 
is  bounded  for  all  t .  and  hence  (3.3.29)  and  (3.3.32)  yield 


II  e(t)-0(t-l)ll  ->0  (3.3.33) 

Applying  the  Minkowski  inequality  to  li0(t)-0(t-k)ii  and  using  (3.3.33)  completes  the 
proof  of  (3.3.25). 


(b)  Stability  of  the  process  (3.1.1 )  and  the  boundedness  of  w(t)  implies  that  the  outputs 
y(t)  are  bounded.  Hence  from  (2.3.16)  (3.3.23b).  and  Lemma  3.2.  it  follows  that 
G(t)  <  {P(t-l)  )[  r  y" -f-  n  max  yTi)  J  <  «= 

l-n  <  i<  l- 1 

where  n  is  the  order  of  the  AR  process  and  r  is  the  order  of  the  MA  process.  It  can  now 
be  shown,  just  as  in  Theorem  3.2  of  fDasgupta.  1987].  that  the  a  prion  prediction  errors 
satisfy  (3.3.26). 


(y(t)-0*^  0(t))"  <2'{^+  2 


L>=1 


E“(t- j) 


(c)  From  (3.3.16) 
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Assuming 

r 

[  X's'  r  <  0-5 

1=1 

then  yields 

(y(t)-e*'^<D(t))  <2y'-+ E"(t- j) 

Substituting  in  (3.3.19)  and  using  (3.3.6)  gives 

V(t)  =  (1  - >L[)V(t  -  1)  +  >L,  2Y’‘+e‘(t- j,)-£^(t) - —  £“(t)  (3.3.341 

L  . 

From  Lemma  3.1.  £^(t-j)  <  E^lt),  where  t-j  is  the  updating  instant  prior  to  time  instant  t. 
Thus 

V(t)-2Y2  <  (1-A,)[  V(t-1)-2Y'2] 

For  large  enough  t.  the  term  on  the  right  hand  side  goes  to  zero  and  hence  for  large 
enough  t 

V(t)  =  (0(t)-0*  )'^P-‘(t)  (6(0-0*)  <  2y2 
And  so  (3.3.27)  follows  from  Lemma  3.2  and  (3.3.14). 

Remarks: 

( 1)  The  results  of  Theorem  3.1.  and  the  results  (3.3.25).  (3.3.26)  of  Theorem  3.2.  do 
not  require  the  process  to  be  stable  (A(q''i  =  1  -  Uj  q"'  -  q  -  -...  -  a^,  q-^,  to  be 
Hurwitz).  However  if  the  process  is  unstable,  then  the  a  priori  prediction  errors  will 
become  very  large,  thereby  causing  overflows.  In  addition,  on  account  of  finite  precision 
effects,  the  matrix  P(t)  may  not  stay  positive  definite,  thus  invalidating  the  notion  of 
bounding  ellipsoids  and  causing  the  algorithm  to  fail.  In  this  situation,  the  ELS  algorithm 
will  fail.  too. 

(2)  Theorems  3.1  and  3.2.  do  not  impose  any  statistical  properties  on  the  input 
sequence!  wit) } .  However  our  simulation  expenence  has  been  that  the  parameter 
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estimates  are  biased  if  the  input  is  not  white.  Of  course,  such  is  also  the  case,  for  the  ELS 
algorithm.  The  EOBE  algorithm  uses  the  boundedness  propeny  of  the  inputs  to  construct 
confidence  regions  (ellipsoids)  for  the  parameters,  irrespective  of  the  color  of  the  inputs. 
This  feature  is  absent  in  any  of  the  existing  ARMA  parameter  estimation  algorithms. 

(3)  The  upper  bound  given  by  (3.3.27)  is  usually  looser  than  the  bound 
li  0  (t)  -  0*  IP  <  a2(t)  /  a4,  yielded  by  (3.3.21). 

3.4  Simulation  Studies 

Simulations  have  been  performed  to  investigate  the  performance  of  the  EOBE 
algorithm  vis  a  vis  the  ELS  algorithm.  In  this  paper,  we  present  simulation  results  for 
three  examples-  a  broad  band  ARMA  (3,3)  process,  a  narrow  band  ARMA(2,2;  process 
and  an  ARMAX(3,3,2)  process. 

Example  3.1  Broad  band  ARMA  (3,3)  process 
The  output  data  {y(t))  is  generated  by  the  following  difference  equation 
y(t)  =  -  0.4  y(t-l )  -I-  0.2  y(t-2)  +  0.6  y(t-3)  +  w(t)  -  0.22  w(t-l ) 

+  0.17  w(t-2)-0.1  w(t-3) 

The  noise  sequence  {w(t)}  is  generated  by  a  pseudo-random  number  generator  with  a 

uniform  probability  distribution  in  [-1,1].  The  upper  bound  was  set  equal  to  25.0.  and 

o2(0)  =  y  -  -0. 1 .  The  parameter  estimates  were  obtained  by  applying  the  EOBE  algorithm 

to  1000  point  data  sequences.  Twenty  five  runs  of  the  algorithm  were  performed  on  the 

same  model  but  with  different  input  noise  sequences.  The  average  squared  parameter 

error  L,(t  ).  is  computed  for  the  AR  coefficients  according  to  the  formula 

25 

"  J  =  1 

where  1,  (t),  the  squared  AR  parameter  error  at  time  t  for  the  j'th  run.  is  defined  by 
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i  =  l 


(a.(t)  -  a.  r 


with  aj  and  aj(t)  being  defined  by  (1.1)  and  (3.3),  respectively.  The  average  squared 
parameter  error  L2(t)  for  the  MA  coefficients  is  defined  analogously.  Fig.  3.1  and  Fig. 
3.2  display  the  average  squared  estimation  errors  for  the  AR  and  MA  parameters  using 
both  the  EOBE  and  the  ELS  algorithms.  The  curves  show  that  the  performance  of  the  two 
algorithms  is  comparable.  The  average  number  of  updates  for  the  EOBE  algorithm  was 
160  for  1000  point  data  sequences.  Thus  only  16%  of  the  samples  are  used  for  updates, 
as  compared  to  the  ELS  algorithm  which  updates  at  every  sampling  instant. 


The  effect  of  different  choices  for  the  upper  bound  'f-  on  the  performance  has  also 
been  studied.  For  each  value  of  the  asymptotic  mean-squared  parameter  estimation 
error  (MSPEE),  was  computed  over  25  runs  of  the  algorithm,  according  to  the  formula 

25 

VBPEE=  — ^11  6  (1000)  -  0*  II  ‘ 

j=i  ^ 

where  0j(lOO())  is  the  parameter  estimate  at  the  lOOO'th  iteration,  in  the  j'th  run.  The 
lower  bound  on  as  calculated  from  (3.3.10c)  is  y2  >  8.54.  The  second  column  of 
Table  3.1  lists  the  different  values  of  MSPEE  obtained  when  y^  is  vaned  from  0.5  to 
100.  In  each  case.  oMO)  =  y^  -  0. 1.  The  third  column  lists  the  average  number  of 
updates.  The  founh  and  fifth  columns  enumerate  the  average  number  of  times  the  true 
parameter  is  not  contained  in  Sj  and  E,  respectively.  The  last  two  columns  provide 
measures  of  the  average  final  volume  and  average  final  sum  of  setTu-a.\es  respectivelv.  It 
is  clear  that  the  centers  of  the  bounding  ellipsoids  are  insensitive  to  the  value  of  y^.  since 
the  tap  error  is  almost  constant,  though  the  final  size  of  the  ellipsoids  is  very  sensitive  to 
y-.  This  can  be  explained  as  follows. 


The  update  equations  (2.3.6)  and  (2.3.4)  for  the  parameter  estimates  and  the  matrix 
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P-'  ,  can  be  affected  by  only  through  the  forgetting  factor  From  (3.2.8),  it  is  clear 
that  the  update  decision  depends  on  whether  52(t)  is  greater  than  or  less  than  [y-  - 
a^lt-l)].  If  52(t)  is  greater  than  [y^  -  o2(t-l)]  then,  from  (3.2.9),  the  calculated  value  of 
/.j  again  depends  on  y^  only  through  [y-  -  a2(t-l)].  But  from  (2.3.5),  y-  -o2(t)  depends 
on  y^  only  through  the  quantities  y-  -  a2(t-l),  y^  -  aHt-2), ..,  y~  -  0^(0).  So  for  all  time 
instants  t,  y^  -a^d)  is  a  complicated  function  of  the  data,  0(0),  P(0),  and  [y-  -  0^(0)] . 
and  consequently,  P(t)  and  0(t)  depend  on  y^  only  through  [y-  -  a2(0)].  Thus,  since 
this  difference  is  constant  in  the  above  simulations,  the  parameter  estimates  are  unaffected 
by  the  value  of  y-.  However,  the  final  size  of  the  ellipsoids  depends  on  a2(t) ,  and  since 
y-  -a-(lOOO)  is  constant  for  all  values  of  y^,  the  final  size  is  proponional  to  y-.  When  y- 
=  0.5,  a'^(lOOO)  is  negative  and  hence  the  final  ellipsoids  do  not  exist. 


TABLE  3.1 


v2 

T 

(dB) 

Avg. 

updts 

Avg. 

0’«E  Sj 

Avg. 
O’g'  E, 

Avg.  final 
volume 

Avg.  final  sum 
of  semi-axes 

0.5 

-15.34 

156 

292 

949 

1.0 

-15.34 

156 

13 

0 

0.22 

10.46 

2.0 

-15.36 

156 

0 

0 

2.6x10^ 

74.08 

5.0 

-15.34 

156 

0 

0 

5.4x10’ 

264.91 

25.0 

-15.35 

156 

0 

0 

2.1x101- 

1536.92 

100.0 

-15.39 

156 

0 

0 

l.OxlO‘6 

6303.29 

The  performance  of  the  algorithm,  when  the  noise  sequence  {w(t))has  a  gaussian 


distnbution.  was  evaluated  in  a  similar  fashion.  A  constant  value  of  'f  =25  was  used  and 


53 

the  standard  deviation  of  the  input  was  varied.  The  results  for  25  runs  of  the  algorithm 
are  shown  in  Table  3.2.  It  is  clear  that  the  unbounded  noise  has  marginal  effect  on  the 
parameter  estimates. 


TABLE  3.2 


S.D 

of  input 

T 

(dB) 

Avg. 

updts 

Avg. 

S, 

Avg. 

Avg.  final 
volume 

Avg.  final  sum 
of  semi-axes 

0.5 

-4.9 

90 

0 

0 

5.6x101- 

1574.6 

1.0 

-5.95 

105 

0 

0 

4.89x108 

333.4 

2.0 

-5.98 

114 

12 

26 

1.47x102 

19.5 

5.0 

-6.29 

119 

323 

965 

- 

- 

The  tracking  capability  of  the  EOBE  algorithm  was  compared  with  that  of  the  ELS 
algorithm  (with  forgetting  factor=0.99).  The  same  model  was  used  to  generate  400  data 
points.  I  he  parameters  were  then  changed  by  150%  and  the  next  400  points  were 
generated.  Finally  the  last  200  points  were  generated  by  using  the  original  parameters. 
The  average  squared  parameter  error  was  evaluated  over  25  runs  and  is  shown  in  Figure 
3.3.  Even  though  the  formulation  of  bounding  ellipsoids  is  based  on  the  assumption  that 
the  parameters  are  constant,  the  simulation  results  show  that  the  algorithm  is  able  to 
accommodate  changes  in  model  parameters.  Analysis  of  the  tracking  ability  of  the 
algorithm  is  the  focus  of  Chapter  V. 


As  mentioned  earlier,  the  transient  behavior  of  the  EOBE  algorithm  has  been 
observed  to  be  superior  to  that  of  the  ELS  algorithm  in  a  number  of  examples.  To 
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demonstrate  this,  ten  Monte  Carlo  runs  of  the  EOBE  and  ELS  algorithms  are  performed 
with  data  records  of  50  points  each. The  parameter  estimation  error  at  each  instant.  (9’'-0 
(t)  (0^-6  (t)  )  and  the  a  priori  prediction  error  are  averaged  over  the  ten  runs  and 

displayed  in  Fig.  3.4.  and  Fig.  3.5  respectively.  The  parameter  estimates  of  the  ELS 
algorithm  tend  to  wander  outside  the  stability  region  in  the  transient  stage,  thus  causing 
unacceptably  high  prediction  error  bursts.  The  inherent  stability  mechanism  of  the  ELS 
algorithm,  however,  ensures  that  the  estimates  do  return  to  the  stability  region.  The 
transient  estimation  error  of  the  EOBE  algorithm,  in  contrast,  is  well  behaved. 

Example  3.2  Narrow  band  ARMA  (2,2)  process 
The  output  data  {y(t)}  is  generated  by  the  following  difference  equation 
y(t)  =  1.4  y(t- 1 )  -  0.95  y(t-2)  +  w(t)  -  0.86  w(t-l)  +  0.431  w(t-2) 

Note  that  in  this  case,  condition  (3.3.10  b)  of  Theorem  3.1  is  violated.  The  noise 
sequence  is  uniformly  distributed  in  [-1.0, 1.0),  as  in  the  fost  example.  The  upper  bound 
Y  ‘  was  set  equal  to  25.0.  The  average  squared  AR  and  MA  parameter  estimation  errors 
are  calculated  over  twenty  five  runs  and  plotted  in  Fig.  3.6  and  Fig.  3.7  respectively.  The 
average  number  of  updates  was  78  for  1(X)0  point  data  sequences. 

For  this  example  too.  different  values  of  the  upper  bound  were  used  and  no 
significant  difference  in  the  quality  of  estimates,  number  of  updates  or  convergence  rate 
was  obser\'ed.  Thus,  it  is  verified  once  again  that  a  precise  knowledge  of  the  upper 
Dound  IS  not  a  prerequisite  for  satisfactory  performance  of  the  algorithm. 

Example  3.3  .ARMAX(3.3.2)  process 

y(t)  =  0.6y(t-I  I  -0.3y(t-2)  -i-  0.25y(t-3)  -t-3.8u(t)  -1.8utt-l  i  -f().7u(t-2) 
w(t)  -H  0.4w(t- 1 )  -0.1  w(t-2) 
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The  measurable  input  sequence  {u(t))and  the  non-measurable  input/noise  sequence 
{w(t))  are  uncorreiated  white  noise  sequences  uniformly  distributed  in  [-1,1].  The  EOBE 
algorithm  can  be  used  for  ARMAX  parameter  estimation,  by  simply  increasing  the 
dimension  of  the  parameter  vector  and  extending  the  regressor  vector  to  include  the 
observable  inputs.  The  analysis  performed  in  the  previous  section  is  still  valid.  In  the 
ARMAX  estimation  problem,  if  estimates  of  the  MA  coefficients  (the  coefficients  of 
C(q‘*)  ),  are  not  required,  then  the  OBE  algorithm  can  also  be  used,  by  modeling  the 
svstem  with  an  ARX  model.  For  the  EOBE  algorithm,  72=10  and  a2(0)=10  .  For  the 
OBE  algorithm,  72=  2.5,  and  o2(U)=10.  P(())  =  1.  in  both  cases.  Ten  Monte  Carlo  run.s 
of  1(XX)  data  points  each  were  performed  for  the  EOBE  and  OBE  algorithms.The  average 
final  parameter  estimation  error  for  the  EOBE  algorithm  was  0. 157  and  the  corresponding 
error  for  the  OBE  algorithm  was  0.69.  For  the  EOBE  algorithm 

(i)  The  sample  mean 

E[e(1000)]  =  [0.53,  -0.27,  0.24,  3.79,  -1.52.  0.63.  0.45,  -0.07]^ 

(ii)  The  average  final  volume  =  6.2x10*0 

(iii) The  average  final  sum  of  semi-axes  =  498.47 

(iv) Tne  average  number  of  updates  =  61 
For  the  OBE  algorithm 

(i)  E[e(1000)]  =  [0.76,  -0.42,  0.27,  3.82,  -2.41.  1.1  1]T 
(  ii)  The  average  final  volume  =  224.25 
(iii)The  average  final  sum  of  semi-axes  =  112.47 
('iv)T7ie  average  number  of  updates  =  128 

Thus  the  parameter  estimates  of  the  OBE  algorithm  are  biased.  For  the  OBE 
algorithm,  the  noise  v(t )  =  C(q‘*)[w(t)j,  and  the  correlation  in  v(t)  is  responsible  for  the 
bias.  The  EOBE  algorithm  in  contrast,  yields  unbiased  estimates,  however  the  confidence 
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regions  for  the  parameters  are  larger  than  the  confidence  regions  yielded  by  the  OBE 
algorithm. 

3.5  Concluding  Remarks 

The  distinctive  features  of  the  EOBE  algorithm  are  fi)  the  discerning  update  strateg}’, 
(ii)  the  uniform  boundedness  of  the  a  posteriori  errors  and,  (iii)  the  fact  that  the  true 
parameters  are  guaranteed  to  lie  in  the  bounding  ellipsoids  at  everv'  time  instant. 
Unfonunately.  the  size  of  these  bounding  ellipsoids  is  very  sensitive  to  the  choice  of 
threshold  Y“-  Simulations  show  that  the  threshold  can  be  taken  much  smaller  than  the 
theoretical  minimum,  thereby  obtaining  tighter  confidence  regions,  without  causing  the 
true  parameter  vector  to  slip  out  of  any  of  the  sets  Sf.  In  an  effon  to  obtain  smaller 
ellipsoids,  the  MVS  algorithm  was  extended,  just  as  in  Section  3.2.  Unfonunately  on 
account  of  the  more  complex  optimization  criterion  used  in  the  MVS  algorithm,  the 
counterpans  of  Theorem  3.1  and  Theorem  3.2  could  not  be  derived.  Simulation  studies 
indicate  that  the  extended  MVS  algorithm  does  yield  smaller  bounding  ellipsoids, 
however,  the  parameter  pcrimare.s  are  biased.  It  U  conjectured  that  the  observed 
unbiasedness  of  the  EOBE  parameter  estimates  is  due  to  the  optimizanon  cntenon  which 
is  an  upper  bound  on  the  normalized  parameter  estimation  error. 


meter  Error  Li  ( 


Figure  3.6  Mean-squared  AR  parameter  estimanon  error  -  Example  3 


CHAPTER  IV 


FINITE  PRECISION  EFFECTS 


4.1  Introduction 

The  behavior  of  adaptive  filtering  algorithms  in  limited  precision  environments  has 
attracted  an  increasing  amount  of  attention  lately.  The  motivation  is,  perhaps,  the 
discrepancy  between  the  theoretical  infinite  precision  behavior  and  the  actual  performance 
of  the  algorithms  when  implemented  in  hardware  or  simulated  on  computers.  Finite 
word-length  computations  can  cause  numerical  inaccuracy  in  the  results  and  numerical 
instability  [Cioffi,  1987].  In  the  case  of  the  RLS  algorithm,  it  has  been  known  for  quite 
some  time  that  the  recursion  for  the  covariance  matrix  inverse  is  numerically  unstable 
and  several  factorization  methods  have  been  developed  to  stabilize  the  recursion 
fBierman.  1977].  Numeric  instability  can  cause  some  of  the  variables  in  an  adaptive 
filtering  algorithm  to  become  unbounded  fairly  rapidly,  as  in  the  case  of  fast  least- 
squares  algonthms  fCioffi.  1984].  On  the  other  hand,  the  accumulation  of  round-off 
errors  can  cause  the  widely  used  LMS  algorithm,  and  even  the  stabilized  RLS  algorithm, 
to  diverge  after  millions  of  iterations.  In  the  case  of  real  time  signal  processing,  at  a 
sampling  rate  of  8  kHz,  this  amounts  to  only  a  few  minutes  of  processing.  It  is  therefore 
imperative  to  consider  the  effects  of  finite  precision  computations  when  analyzing  or 
designing  adaptive  filter  algorithms. 

The  issue  of  finite  word-length  effects  in  MSPE  algorithms  has  begun  to  receive 
attention  only  recently.  In  [Walter,  1988],  the  potential  numerical  problems  which  can 
arise  with  the  exact  cone  updating  (FCU)  algorithm  are  discussed  and  a  robust 
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modification  is  suggested.  In  this  chapter,  finite  precision  effects  in  the  DHOBE 
algorithm  are  studied  through  simulations  and  by  analyzing  the  stability  propenies  of  the 
error  propagation  in  the  algorithm.  Since  the  update  equations  of  the  DHOBE  algorithm 
are  very'  similar  to  the  RLS  algorithm,  the  potential  problems  which  can  arise  due  to  the 
effects  of  finite  word-length  in  the  RLS  algorithm  are  discussed  in  Section  4.2.  In 
Section  4.3,  a  sensitivity  analysis  of  the  forgetting  factor  determination  formula  in  the 
DHOBE  algorithm  is  performed,  and  a  modification  is  suggested  to  increase  the 
robustness  of  the  formula.  An  analysis  of  the  error  propagation  in  the  DHOBE  algorithm 
is  performed  in  Section  4.4  by  studying  the  stability  properties  of  two  coupled  nonlinear 
difference  equations.  The  equations  are  shown  to  be  BIBO  stable  and  consequently  the 
error  in  the  estimates  is  bounded.  The  effect  on  the  bounding  ellipsoid  of  the  errors  due  to 
finite  word-length  computations,  in  one  iteration  of  the  algorithm  is  studied  in  Section 
4.5.  In  Section  4.6.  the  fixed  point  implementation  of  the  algorithm  is  described  and 
simulation  results  are  presented  which  show  that  the  DHOBE  algorithm  yields 
consistently  good  estimates  over  a  large  range  of  word-lengths.  In  particular,  the 
performance  is  superior  to  that  of  the  RLS  algorithm  for  small  word-lengths. 


4.2  Finite  Word-length  Effects  in  the  RLS  Algorithm 


The  update  equation  for  the  parameter  estimates  of  the  RLS  algorithm  with  forgetting 
factor  A  is  given  by 

d(n  =  d(!-I)  +  ^U)0(r)[y(r) - 0^(r)d(r -  I)]  (4.2. 1 ) 

where  the  Kalman  gain  A'(t)  is  defined  by 

F(/-l)0(!) 


K(i)  = 


A.  +  0‘  l)tt>(t) 


(4.2.2) 


and  the  matnx  update  equation  is  given  by 
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P{t)  =  - 

K 


P(f-l)0(f)(D^(r)P(r-l) 
X  +  0'^  {t)PU-\)Oit) 


(4.2.3) 


Due  to  round-off  errors,  the  subtraction  involved  in  (4.2.3)  may  cause  f’(r)to  become 
indefinite  (neither  positive  nor  negative  definite).  The  resulting  least-squares  estimates 
will  be  incorrect.  This  sign  change,  however,  does  not  usually  result  in  overflow  [Cioffi, 
1987].  The  matrix  recursion  can  be  stabilized  by  replacing  it  with  a  set  of  recursions  that 
propagate  the  upper-diagonal-upper  transpose  (UDU^^)  factorizadons  of  the  matrix  P{t) 
[Bierman,  1977].  These  recursions  ensure  that  the  updating  of  the  diagonal  matrix  D(t) 
maintains  positive  diagonal  entries  thereby  ensuring  that  Pit)  remains  posirive  definite. 


The  homogeneous  difference  equation  for  the  error  in  the  estimates  of  the  RLS 
algorithm  is 

e(r)  =  0(r)-0*  =  [/-/f(r)C)(r)<DT'(r)]0(r_i)  (4.2.4) 

It  is  easy  to  show  that  (4.2.4)  is  exponentially  stable  if  ^  <  1  and  P(t)  is  unitormly 
bounded  [Ljung,  1985]  .  If  X  =  I,  then  only  asymptotic  but  not  exponential  stability  can 
be  concluded.  Round-off  and  quantization  errors  appear  as  inputs  to  (4.2.4).  If  the 
forgetting  factor  is  equal  to  unity,  then  these  errors  can  cause  the  estimation  error  to 
become  very  large.  This  observation  is  confirmed  by  the  detailed  statistical  analysis  in 
[.Ardalan.  1987],  Thus  even  when  the  parameters  are  not  rime  varying,  it  is  advantageous 
from  a  numerical  point  of  view  to  use  a  forgetting  factor  less  than  unity. 

Even  after  stabilization,  the  RLS  algorithm  (with  ^  <  1)  may  exhibit  long  term 
instability,  if  the  input  is  spectrally  ill  conditioned.  A  heuristic  explanation  for  this 
phenomenon  fCioffi,  1987],  for  the  case  of  a  FIR  adaptive  filter  adapting  to  a  moving 
average  system  model  is  as  follows.  Assume  that  the  input  x(t)  to  the  adaptive  filter  is 
bandlimitcd,  i.e.  its  Fourier  transform  .satisfies 


X(eJ^ri  =  non-zero,  for  Ico-  cool  <  Wb 


and 
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X(ei“)  =  0  ,  for  Ito  -  coqI  ^  o)b 

If  the  filter  coefficients  of  the  adaptive  filter  0(t)  vary  very  slowly  with  time,  then  a 
Fourier  transform  ©(eJ^^)  of  the  filter  coefficients  could  be  defined.  Then 


0(ej“)X(ej‘^)  =  non-zero,  for  Ico  -  coqI  < 


and 

0(eJ“)X(eJ‘^)  =  0  ,  for  ICO  -  cool  >  cob 

Over  the  frequency  range  Ico  -  coqI  ^  cOb ,  the  filter  response  0(eJ“)  can  take  on  very  large 
values  without  affecting  the  output  error  and  hence  the  adaptive  filter  does  not 
compensate  for  this  growti..  This  growth  then  leads  to  an  overflow  of  the  registers 
containing  the  filter  coefficients.  This  unbounded  growth  can  be  avoided  if  a  small 
diagonal  constant  is  added  to  P(t)  at  eveiy  iteration,  or  to  the  diagonal  matrix  in  the 
UDU”^  implementation. 


4.3  Sensitivity  Analysis 

In  this  section,  the  effects  of  small  penurbations  in  the  inputs  to  the  updating  gain 
determination  formula  (2.3.8)-(2.3.I4)  are  studied.  If  the  resulting  penurbation  in  /i^is 
small,  then  the  change  in  the  estimates  P(t),  0(t)  and  a^lt)  is  also  small.  This  can  be  seen 
by  examining  the  update  equations  of  the  algorithm.  From  (2.3.4),  it  is  easy  to  see  that 

P'kt)  is  a  continuous  function  of  .  The  partial  derivative  of  P'*(t)  with  respect  to  \  is 

3P  ^ ( t )  1  T 

— ^  =  -p-‘(t-l)  +  0(t)cI)^(t) 

oAt 

Ass'’me  that  there  exist  constants  Mi  and  M2  such  that 

MiI<P->(t)<M2l 

and  assume  further  that  II  0(t)  II  is  bounded,  i.e.  the  ARX  process  is  stable  and  has 
bounded  inputs  then 
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ap>(t) 

aXt 


Ik  oo 


The  perturbation  AP'*(t)  is  thus  small  if  the  penurbation  in  ^  is  small.  It  is  shown  in  the 
next  section  that  AP(t)  =  P(t)AP'*(t)P(t).  Hence  the  penurbation  in  P(t)  will  be  small. 
Similarly,  (2.3.6)  shows  that  0(t)  is  a  continuous  function  of  and  that  the  derivative  is 
bounded  provided  the  eigenvalues  of  P(t)  are  upper  bounded  and  llO(t)  li  is  bounded. 
Hence  the  penurbation  in  6(t)  will  also  be  small.  The  derivative  of  a^Ct)  in  (2.3.5)  is 

f2/',\  /-I  "1  \2  1  2/ 


(4.3.1) 


Using  the  fact  that  0  <  <  a  <1,  it  is  relatively  straightforward  to  show  that 

1-a  (l-X+>.^G(t)  f 


The  derivative  of  aHt)  is  thus  bounded,  and  so  the  resulting  penurbation  in  a2(t)  is  also 
small. 

To  b  in,  a  case  in  which  a  small  change  in  a^(t-l)  and  5^(0  can  cause  a  large  change 
in  the  updating  gain  factor  is  considered.  In  the  subsequent  discussion,  the  quantities 
computed  with  finite  precision  are  denoted  by  primes.  It  is  also  assumed  that  the 
penurbations  in  the  inputs  to  the  updating  gain  factor  are  small,  i.e.  there  exists  a  positive 
A  «  1  such  that 

|a’2(t- 1  )-a2(t- 1  )l  <  A,  |5'2(t)  -  52t)l  <  A,  IG’(t)-G(t)l  <  A  (4.3.2) 

Consider  the  following  scenario 

a2(t-l)  +  5^(0  <  but  a’2(t-l)  +  6’2(t)  >  y2  (4.3.3) 

In  this  case  A(  =  0,  and  if  5'2(t)  =  0,  then  A',  =  a.  Thus  the  penurbation  in  the  updating 
gain  factor  AA{  =  X\  -AfCan  be  substantial.  It  can  be  shown  easily  that  if  (4.3.3)  holds 
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then  there  is  a  significant  difference  between  and  only  if  5’2  (t)  is  small. 
However.  which  is  the  quantity  being  minimized,  is  then  marginally  affected  by  the 
change  in  .  This  is  because  the  derivative  in  (4.3.1)  is  close  to  zero  since  (4.3.2)  and 
(4.3.3)  imply  that  5^(1)  and  the  difference  (©^(t-l)  -  y2)  is  small.  Thus  though  the 
penurbation  is  large,  the  difference  in  the  calculated  value  of  o^(t)  will  be  marginal. 
The  large  perturbation  will  cause  a  large  perturbation  in  the  matrix  P(t),  though  the 
perturbation  in  0(t)  will  be  small  since  52(t)  is  small.  In  general,  the  formula  (2.3.8)- 
(2.3.14),  can  cause  large  perturbations  in  which  only  marginally  affect  the  resulting 
value  of  o2(t),  if  the  following  condition  holds  for  some  E.  suitably  small 

!  y-- a“(t- 1  )l  <  £.  and  5^(4)  <e  (4.3.4; 

In  order  to  make  the  updating  gain  formula  more  robust,  it  is  modified  by  adding  the 
following  additional  condition  to  (2.3.10)-  (2.3.14) 

If  I  y--  a2(t-l)l  <  e  and  52(t)  <  e,  for  some  suitably  small  e  (4.3.5) 

then  =  0 

For  the  modified  algorithm  if  the  situation  of  (4.3.3)  occurs  then  the  penurbation  in  A( 
would  be  small  and  the  resulting  value  of  o'~(t)  would  be  almost  optimal.  The  same  is 
true  if  a^lt-D-i- S-it)  >  y-  ando'^O-l  )+ 5'‘(t)  <  y- .  However,  the  modified  formula 
can  in  some  cases  still  cause  a  large  penurbation  Aa,.  For  example  if  I  y-  -  G  -  (t- 1  )  I  <  £ 
and  52(t)  <  E  and  I  y  -  -  o2(t-l  )l  <  e  and  5'2(t)  >  e,  then  Xi  =  0  and  A.'i could  be  as  large 
as  a.  But  since  5'2(t)  is  no  longer  negligible,  it  is  clear  from  (2.3.5)  that  the  decrease  in 
G^(t)  is  not  insubstantial.  In  general,  it  is  easy  to  show  that  for  cases  in  which  the 
modified  formula  yields  a  large  value  of  A^t,  the  calculated  updating  gain  factor  causes  a 
significant  decrease  in  G“.  Thus  the  modification  is  a  compromise  between  minimizing 
G^(t)  and  reducing  the  penurbation  in  the  updating  gain  factor.  In  contrast  using  (2.3.8)- 
(2.3. 14 1,  without  any  modification,  could  cause  large  A?li.  thus  causing  large 
perturbations  in  0(t)  and  P(t),  while  decreasing  o^only  marginally. 
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The  sensitivity  of  the  modified  formula  to  penurbations  in  its  inputs  is  now  evaluated, 
assuming  that  the  situation  of  (4.3.4)  does  not  occur  for  the  perturbed  and  unpenurbed 
algorithms,  since  it  has  already  been  observed  that  the  modified  formula  can  be  quite 
sensitive  if  (4.3.4)  holds . 

Theorem  4.1.  If  the  perturbation  in  a2(t-l),  G(t)  and  52(t)  is  less  than  A.  for  some 
small  positive  number  A<  E,  where  £  is  given  by  (4.3.4),  and  if  (4.3.4)  does  not  hold, 
then  the  penurbation  in  is  of  0(A)  provided  that  a  <  0.414. 

Proof;  The  penurbation  in  the  updating  gain  factor  is  evaluated  assuming  that 
o2  (t-1)  +  52(t)  >  y-  anda'2(t-l  )+5'2(t)>Y2.  From  (2.3.1 1)-(2. 3. 14).  it  can  be  seen 
that  there  are  a  total  of  8  distinct  cases  in  which  there  can  be  a  perturbation  in  the  updating 
gain  factor; 

Case  1:  6'2(t)=  0  and  G(t)  =  1; 

Case  2;  5'2(i)  and  l+P(t)(G(t)-l )  >  0: 

Case  3;  5'2(i)=0  and  l+(3(t)(G(t)-l)<  0; 

Case  4;  G'(t)  =  1  and  G'(t)  =  1; 

Case  5:  G'(t)  =  1  and  l+P(t)(G(t)- 1 )  >  0; 

Case  6:  GCt)  =  1  and  l+P(t)(G(t)-l)  <  0; 

Case  7;  1  +p'(t)(G'(t)- 1 )  >  0  and  1  +p(t)(G(t)- 1 )  >  0; 

Case  8:  l+P'(t)(G'(t)- 1)  >  0  and  l+P(t)(G(t)-l )  <  0 

The  penurbation  in  each  case  is  evaluated  now. 


Case  1.  5'^(i)  =  0  and  G(t)  =  1: 


Since  it  is  assumed  that  (4.3.4)  does  not  hold,  it  follows  from  (2.3.11)  that  A't=  a.  The 


assumption  that  the  perturbation  in  the  inputs  to  the  formula  is  less  than  A  would  imply 
that  52(i)  <  A<  E.  Hence  P(t)  = 

52(t) 

Vt  =  — =5-^  >  0.5.  If  a  <  0.5,  then  A.t  =  a  ,  and  so  the  perturbation  AXt=0. 

Case  2.  S'^it)  =  0  and  l+(i(t)(G(t)-l)  >  0: 

It  has  been  shown  above  that  P(t)  <  -1.  This  together  with  l+P(t)(G(t)-l  )  >  0.  would 

imply  that  G(t)  <  2.  Using  (2.3.13),  it  can  be  shown  that  when  l+p(t)(G(t)-l)  >  0,  then 

v.  > - ^  -  ,  i.e  Vt  ^  0.414.  Hence  if  a  <  0.414.  then  At  =  a  .  and  so  the  perturbation 

l-i-vG(t) 

AXt  =  0. 

Case  3.  5'2(t)  =  0  and  l+P(t)(G(t)-l)  <  0: 

From  (2.3.14)  Xt  =  a,  and  hence  the  penurbation  A?ii=0. 

Case  4:  G’(t)  =  1  and  G'(t)  =  1: 

r  U-  ^  e-  •  •  ^  . 

In  this  case  v  t  =  — 9 —  and  Vf  =  — —  .  Since  it  is  assumed  that  the  situation  ot 

(4.3.4)  does  not  occur,  the  values  of  P(t)  and  P'(t)  will  differ  by  an  amount  greater  than 
0(A)  only  if  b^lt)  is  small.  But  then  both  P(t)  and  p'(t)  will  be  large  negative  numbers 
and  so  A't  =M  =  ot  . 


Case  5:  G'(t)  =  1  and  l+P(t)(G(t)-l)  >  0: 

Let  G(t  )  =  1+  ri  ,  where  Iri  |  <  A.  Then  from  (2.3.13),  Vt  = 

-P(t 


1+Tl 


l+TlP(t) 


Hence  V’t  = 


■+  OCT)  2) ,  Thus  as  in  Case  4,  the  perturbation  will  be  of  0(A). 


Case  6:  G'(t)  =  1  and  l+p(t)(G(t)-l)  <  0: 
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From  (2.3.14),  Xt  =  a  .  Since  IG(t)  -  G'(t)l^  A,  therefore  either  G(t)  =  1+  ri  or  G(t)=  1- 
rj,  where  T)  <  A.  If  G(t)  =  1+  r\,  then  l+P(t)(G(t)-l)  <  0  would  imply  that  P(t)  <  —  ,  i.e. 

5^(1)  =  0(7]).  Hence  5'2(t)  <  A  +  0(ri)  and  therefore  3'(t)  «  -1.  Thus  X't  =  =  oc  • 

On  the  other  hand  if  G(t)=  1-  r]  then  l+P(t)(G(t)-l)  <  0  would  imply  that  P(t)  >  1, 

T1 

which  contradicts  the  assumption  that  a  ^  (t- 1 )  +  52(t)  >y^. 

Case  7:  l+p'(t)(G'(t)-l)  >  0  and  l+P(t)(G(t)-l)  >  0: 

An  expression  for  the  perturbation  Avt  is  obtained  by  evaluating  the  partial  derivatives  of 
Vt ,  from  (2.3.13).  with  respect  to  P(t)  and  G(t).  For  brevity,  the  time  suffix  is  dropped 
in  the  expressions  below 

Av  =  —  AP  +  —  AG 
ap  dG 

where 

d\>  _  _ VG _ 

ap  “  ’(2[1+  p(G-l)]3/2 

and 

^  -G  4-  pG  +  p  -  2PG  -  -  1 
aG  ~  2VG(G-1)2[1+  P(G-1)]3/2 


It  can  now  be  assumed  that  there  exists  a  positive  r|,  suitably  small  such  that 

IG-ll>ri,  and  l+p(G-l)>r|. 

This  is  because  if,  say,  IG-ll  <  7\,  then,  as  in  Case  5,  V  =  v  =  and  therefore  as 

discussed  in  Case  4,  Aa  is  small.  If  0  <1+P(G-1)  <  ri,  then  from  (2.3.13)  v'»l  and 
so  X't  =  At  =  a.  The  above  assumptions  along  with  the  assumption  that  G(t)  is  bounded 
ensure  that  the  panial  derivatives  are  bounded  and  hence  there  exist  K]  and  K2  such  that 

lAvtl  <  K]  IAp(t)l+  K2lAG(t)l 
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Thus  if  the  perturbation  AP(t)  and  AG(t)  is  of  0(A)  then  the  perturbation  A>Lt  is  also  of 
0(A). 


Case  8:  l+p'(t)(G'(t)-l)  >  0  and  l+p(t)(G(t)-])  <  0; 

In  this  case,  =  ot.  If  IG'(t)-ll  <r[  for  some  small  number  ri,  then  as  in  Case  6, 
P'(t)  ^  —  and  as  in  Case  5,  »  1,  and  hence  A.'t=  a. 


If  G(t)  differs  sufficiently  from  unity  then  either  l+P'(t)(G'(t)-l)  is  very  small  or  else 
P'(t)  and  p(t)  differ  greatly.  If  l+P'(t)(G’(t)-l)  is  small  then  from  (2.3.13)  v't  >  a,  and 
so  =  a.  On  the  other  hand  P'(t)  and  P(t)  differ  substantially  only  when  5^(t)  is  small 
(assuming  (4.3.4)  does  not  hold).  Then  by  the  same  argument  as  in  Case  2,  =  a.  if 

a  <  0.414. 


VVV 


4.4  Error  Propagation  in  the  DHOBE  Algorithm 

The  error  propagation  properties  of  the  DHOBE  algorithm  are  analyzed  by  focusing 
on  the  propagation  of  a  single  error  in  6(t)  and  P(t)  to  future  instants.  Assume  that  at  time 
instant  Iq  there  is  a  perturbation  due  to  round-off  error  in  the  estimates  0(tQ)  and  P(t{)).  so 
that  0'(to)  =  0(to)  +  A0(t())  and  P'(to)  =  P(to)+AP(tQ),  where,  the  primed  quantities  are 
the  penurbed  ones.  The  problem  considered  here  is  the  effect  of  these  errors  on  the 
estimates  0'(t)  and  P'(t)  at  t  >  tq,  assuming  that  the  computations  are  performed  with 
infinite  precision.  Similar  studies  have  been  performed  by  Ljung  and  Ljung  [Ljung. 
1985)  in  their  investigation  of  the  error  propagation  propenies  of  RLS  algorithms. 
Though  the  update  equations  of  the  DHOBE  algorithm  are  similar  to  those  of  the  RLS 
algorithm,  the  presence  of  the  updating  gain  factor,  which  is  a  discontinuous  function  of 
the  estimates,  complicates  the  analysis.  In  particular,  the  results  on  penurbed  linear 
differential  equations  (used  in  [Ljung,  1985)  )  cannot  be  applied.  Error  propagation  in  the 
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DHOBE  algorithm  is  investigated  by  performing  a  first  order  penurbation  analysis  of 
two  coupled  nonlinear  difference  equations.  The  analysis  yields  coupled  difference 
equations  whose  homogeneous  parts  are  exponentially  stable.  An  upper  bound  on  the 
error  in  the  estimates  due  to  finite  precision  computations  is  given  by  the  following  result: 

Theorem  4.2.  If  the  following  assumptions  hold: 

(i)  The  matnx  P(t)  is  well  conditioned,  i.e.  there  exist  positive  r],  and  r],  such  that 

0  <  ri  j  I  <  P(t)  <  T|2  I  for  all  t  (4.4.1) 

(ii)  Tlie  ARX  process  is  stable  and  has  bounded  inputs. 

(iii  )  The  unpenurbed  algorithm  yields  bounded  prediction  errors, 

(iv  )  For  some  integer  M,  if  the  unpenurbed  algorithm  has  M  updates  in  an  interval  of 
time,  then  the  penurbed  version  updates  at  least  once  in  that  interval, 

(v)  .At  the  updating  instants  of  the  penurbed  algorithm,  a  lower  bound  p  is  set  for  the 
updating  gain  factor  V-  where  p  is  a  suitably  small  positive  number. 


Then  there  exist  constants  mj  and  h  which  depend  on  the  bounds  on  the  prediction 
cnor  of  the  unpenurbed  algonthm  and  the  inputs  and  outputs  of  the  process  such  that  the 
error  between  the  penurbed  and  unperturbed  quantities  at  the  updating  instants  {tj^}  of  the 
perturbed  algorithm  is  bounded  as 

ilAP(t.,)ll  <  (H:  )~(l-p)'‘^'^'‘ IIAP(0)ll  +  Ti;  m  max  A>l  M  '  - -  (4.4.2) 

rij  “  i<uik  p 

kA1-l  U  ,A->  I^ti  /I  , 

ilAGlt,  )ll  <  (1-p)  II  Ae(n)ll+BAhmaxlA?L,  |  — ll-(l-p)  |  + 

0  ^  ■;  p 


+ 


max  IIAP(t  )li 

l<(<k  ' 


(4.4.3) 


where  ri|  and  132  are  as  in  (4.4.1). 
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Proof ; 

Define  R(i)  =  P  ’(t).  Then 

R(t)  =  (1-Xt)  R(t-I)  +  0^(1)  (4.4.4) 

Assume  that  R(t)  and  0(t)  are  perturbed  by  an  infinitesimal  amount  at  time  instant  t^.  TTie 
penurbed  quantities  at  any  time  instant  t,  will  satisfy 

R'(t)  =  d-Xj’)  R'(t-l)  +  Xj’O(t)  a>T(t)  (4.4.5) 

where,  as  before,  the  perturbed  quantities  are  denoted  by  primes.  Subtracting  (4.4.4) 
from  (4.4.5)  it  follows  that 

AR(t)  =  d-Xt')  AR(t-l)  -  AXt  [cb(t)  OT'd)-  R(t-l))  (4.4.6) 

where 

AR(t)  =  R’(t)  -  R(t)  and  AX^  =Xt'  -  Xj,  (4.4.7) 

The  difference  equation  (4.4.6)  can  be  expressed  as 

C 

AR(t)  =  nn-X.')AR(tQ)  + AX^((D(t)<D'^(t)  -  R(t-l) )  +  I(t)  (4.4.8) 

l=to  +  l 

where 

1-1  f 

l(t)  =  T  n  (l-X;)AX.fO(t-j)cI)  (t-j)-  R(t-j-l)  ]  (4.4.9) 

The  summation  in  (4.4.9)  can  be  taken  over  the  subsequence  (t^,  u  =  1,  2,..)of  instants 
at  which  updates  are  performed  for  either  the  perturbed  or  the  unpenurbed  algorithm. 
This  is  because  at  all  other  instants  i ,  AXj  =  0-0  =  0.  Also,  by  the  assumptions  (i)  and  (ii) 
of  Tbeorem  3.2,  there  exists  a  constant  mj  >  0,  such  that  for  all  t, 

||0(t)0T(t)-R(t-l)ll  <mi  (4.4.10) 

Thus  at  any  instant  t|^  e  { tu ) 

k-l 

III(t^)ll  <  m,  X  n  d-X;)lAXJ  (4.4.11) 

u=I  r=u-»-l 
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Now  assumption  (iv)  of  the  theorem  implies  that  every  M  consecutive  elements  of  the 
subsequence  {t^}  contains  at  least  one  instant  at  which  the  perturbed  algonthm  performs 
an  update.  Thus  at  least  tkTMj, (where  L.J  denotes  integer  part),  updates  of  the  penurbed 
algonthm  have  occurred  at  time  instant  tj^^.  Then  using  assumption  (v),  (4.4.11)  can  be 
expressed  as 

ill(t,)ll  <m,  max  lAX  lfM-1  +  M  (1-p  +  (1-p)^  ...+ tl-p)' ]  (4.4.12) 

‘  i<u<k-l 

For  example  if  k;=10  and  M=3.  then  in  the  worst  possible  case.  i.e.  one  in  which  the  right 
hand  side  of  (4.4.11)  would  be  the  largest,  there  would  be  only  one  update  in  the 
penurbed  algonthm  for  every  three  instants  of  {t^l.  Hence  from  (4.4.1 1)  an  upper  bound 
on  II  I(t|^)  II  would  be 

111(1)11  <  m,  maxlAX  I  [l  +  l+3(l-p)  +  3(l-p)‘ +  2(l-p)'] 

L<i<9  '■ 

Upper  bounding  the  first  and  second  terms  in  the  right  hand  side  of  (4.4.8)  and  using 
(4,4. 12)  then  yields  the  following  expression  for  the  norm  of  the  error  in  the  matrix  R(t) 
at  instants  which  are  updating  instants  for  either  the  penurbed  or  the  unperturbed 
algonthm 

IIARit^)!!  <  ' '  IIAR(t^)ll  +  m^max  |  AU  I  M  ■■■■■■  - -  (4.4.13) 

l<u<k  ^  p 

Thus  the  perturbation  in  R(tk)  is  bounded.  Note  that  no  approximations  w'ere  required  to 
obtain  (4.4.13).  In  order  to  obtain  an  upper  bound  on  the  error  in  P(t),  a  first  order 
analysis  is  performed.  It  is  assumed  that 

IIAR(t)  II  =  0(e),  with  e  «/^j^,^(  R(t) )  where  prefers  to  the  minimum  eigenvalue. 
Then 

P'(t)  =  R'-'(t)  =  (R(t)+AR(t))-'  =  R->(t)  -  R-Ut)  AR(t)  R-'(t)  +  A,  (4.4. 14) 

where  II  A.  II  =  0(  e-  )  .  Neglecting  0(  e’)  terms  yields 
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AP(t)  =  P(t)  -  P(t)  =  P(t)AR(t)P(t) 


(4.4.15) 


II  AP(t)  II  <  ri:'  II  AR(t)  II 


Similarlv,  neaiectins  0(  e-  )  terms  yields 


II  AR(t)ll<  lAii^ll  AP(t)ll 


(4.4.16) 


(4.4.17) 


where  ri]  and  rii  are  defined  in  (4.4.1).  Pinally,  using  (4.4.16)  and  (4.4.1/)  in  (4.4.13; 
yields  (4.4.2). 


From  (2.3.6)  and  (2.3.7),  the  time  recursions  for  6(t)  and  0'(t  )  can  be  e.xpressed  as 


9(t)  =  [  I  -  /.,  L(t)  d>T(t)  I  +  Ut)  y(t) 


(4.4.1 S) 


0'(t)  =  [  I  -  /.('  L’(t)  <I>T(t)  1  +  /.j'L'(t)  y(t) 


(4.4.19) 


where 


Ut)  =  P(t)  0(t) 


(4.4.20) 


Subtracting  (4.4.18)  from  (4.4.19)  and  performing  some  manipulations  yields 
A0(ti=  [I  -  At'L'(t)OT(t)|  A0{t  l)+(AAtL(t)  + V‘^L(t)]  [y(t)-cI)T(t)0(t-l))  (J.4.21) 


A0tti  =  (I  -  /./L'tt)OT(t)|  A0(t-1 )  +  J](t)  +  Jodi 


(4.4.22) 


wnere 


J|(t)  =  AAj  L(t)  (y(t)  -  d>'^(t)0(t-r))  =  A/t,  L(t)  6(t) 


(4.4.23. 


Jod)  =  A.  .AL(l)  (y(t)  -  O''(t)0(t-1 ))  =  A,'AL(t)  6(t) 


(4.4.:3bi 


I  -  >.,  L'(t)OT(t)l  =  ( 1-  p  (t)  p  -i  (t-1  I 


(4.4.24; 


Thcrcinrc  (4.4.22)  can  ne  e.xpressed  as 

A0(ti  ^  I  1-  /.,  )  P'(t)  [’'-I  (t-1  )  A0(t-l  I  -  Jidi  -  Jmd 
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Solving  the  difference  equation  (4.4.25)  yields 

A0(n  =  (1-/..')  P'(l)  P'‘(tj,)A0(t„)  -r  J,(t)  +  J.,(t)  +  Jjt)  +  J_,(ti 

1  =  -1 

(4.4.26) 

where 

t- 1  t 

J4t)  =  ^  P'(t)  F  *(j)  L(j)  5(j) 

j  =  Vi * i  i  =J+' 

(4.4.27a) 

and 

1-  ]  t 

JAt)=  ^  ]^fl-X;)  P'(t)  P'  ‘(j) /-■  ALfj)  5(j) 

i  -  t,,-i  1  =1-1 

(4.4.27b) 

Since  the  perturbation  AR(t)  is  of  order  c,  from  (4.4.15)  the  perturbation  AP.'tj  is  of  the 

same  order.  In  order  to  facilitate  the  analysis,  it  is  assumed  that  A/.[  =  Oie).  Then 

substituting  for  L(j)  from  (4.4.20)  and  neglecting  Ofe-)  terms  yields 

j-  1  1 

j,(t)  =  ^  ^ 5(j) 

J  =  ‘o  ■“  '  =j'^' 

(4.4.28) 

Hence 

t-  !  i 

.!J4t)l!<  Ti,  max  [  ||0(i')ll  i5(j)l  1  ^ 

■  '  ,1  ■^1  ■' 
n  ■'  i=tf,  -1  1=1*1 

(4,4.29) 

Assumptions  (ii)  and  (  iii)  ensure  that  there  exists  a  constant  h  >  0,  such  that 

llOfj)  II.  1  6(j  )l  <  h  <  oo 

Now  .  as  before,  the  upper  bound  (4.4.29)  is  evaluated  at  the  updating  instants  of  the 

perturbed  and  the  unpenurbed  algorithm.  The  summation  of  products  in  id.d.Z'-O  can  be 

upper  bounded  as  in  (4.4.13)  by 

■ 

^  FT*  1 )  A>.  :2  max  lA/.,  1  (  M  -  1+ .\l('l-p (  l-pT  *1  1- 

.  ;  ,  Irtuek  1 

1  lerx  c 
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1  (1 

IIJ,(t,)li +  IIJ^(t,)ll<  ii^h  max  lAll  M— — -  (4.4.30) 

'  ^  -  i<j^k  J  p 

Since  AL(j)  =  AP(j)0(j),  after  neglecting  second  order  terms  (4.4.27b)  becomes 

t-i  t 

Jj(t)=  ^  a'  P0)P‘(j)  AP(j)cD(j)5(j)  (4.4.31) 

J  =  t^-rl  i  =J  +  1  ^ 


Thus 

ri  *■'  ‘ 

llJdt)ll  +  II  J,(t)ll  <  h— maxll  AP(j)ll  1  y  X'  lll-X.')  +  X'  ]  (4.4.32) 

J=in-^>  ‘=J+' 


In  .Appendix  4.A,  it  is  shown  that  the  term  in  the  square  brackets  in  (4.4.32;  is  less  than 
unity  and  so  combining  (4.4.26),  (4.4.30)  and  (4.4.32)  yields 

liAe(t')!l  <  (l-p)‘^^‘’|IAe(tJII  +  n.hmaxlAX  I  — Il-(1-P)“'^''’d  + 

^  0  -  i<j<i^  p 


T1. 

h—  max  IIAP(t.)ll 

Tlj  l<j<k  J 


Thus  the  penurbation  in  the  parameter  estimates  is  also  bounded. 

VVV 


Remarks 

(1)  .Assumption  (iv;  is  a  technical  artifice  which  facilitates  the  analysis.  It  is  highly 
unlikely  that  it  would  ever  be  violated.  Violation  of  this  assumption  would  imply  that  the 
squared  prediction  error  for  the  perturbed  algorithm  is  upper  bounded  by  y-  for  large 
durations  of  time.  Tfien.  p  .v-ided  the  input  and  noise  sequences  are  sufficiently  rich  and 
uncorrelated,  it  is  easy  to  show  that  II  0'(t)  -  6*  II  “  <  Oly-)  for  those  intervals  of  time. 


'2)  .Assumption  (v)  is  a  technical  device  required  to  ensure  that  the  homogeneous  parts  of 
i4  4.2)  and  (4.4.3)  are  exponentially  stable.  If  p  <  O.OOl,  then  in  practice  the  values  of 
at  the  updating  instants  will  usually  be  larger  than  p. 
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(-')  Though  the  analysis  of  error  propagation  has  ignored  the  effect  of  round-off  errors  in 
computations,  since  the  homogeneous  parts  of  (4.4.2)  and  (4.4.3)  are  exponentially 
stable,  the  errors  at  any  time  instant  due  to  round-off  errors  created  at  previous  time 
instants  would  be  bounded  [Ljung,  1985|. 


4.5  Finite  Precision  Effects  on  the  Bounding  Ellipsoid 

In  this  section,  the  effect  of  round-off  errors  (in  one  iteration)  on  the  resulting 
Dounding  ellipsoid  is  studied.  More  specifically,  we  ask  the  question  -  If  d*  e  Ei-i  •  can 
errors  m  the  comnutation  of  El  li.e.  computation  of  0(t).  Ptt)  and  o-(t)  )  cause  0^  e  E[. 
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V(t)-o2(t)  =  (1-X^)  [V(t-l)-o2(t-l)]  +  At  [v2(t)-y2| 

+  2AiTBt  A,  +  AiTA2At+A3  (4.5.8) 

From  the  definition  of  Ei  it  is  clear  that  9*e  £[  iff  V(t)  <  o^ft).  Thus  if  the  errors  Ai, 
At,  and  A3  are  large  enough,  it  is  possible  that  0*  €  Et.  A  sufficient  condition  for  6*  6 
Et  is 

l2Ail'BtAt+AtTA2Atl  <  Xj  [y- -  v2(t)]  (4.5.9) 

In  case  a,  =  0,  then  since  no  update  occurs  0*  e  Et  automatically.  The  condition  (4.5.9) 
shows  that  if  the  errors  due  to  finite  word-length  computations  are  small  enough  then 
0*eEtand  furthermore,  by  setting  y-  higher  than  the  actual  bound  on  the  noise,  the 
robustness  of  tne  algonthm  to  finite  precision  effects  can  be  increased. 


4.6  Simulation  Studies 


Simulation  Setup.  A  fixed  point  implementation  of  the  OBE  algorithm  was  simulated 
by  pert'orming  the  operations  in  integer  arithmetic.  The  input  and  output  obserx'ations. 
which  are  generated  as  floating  point  numbers,  are  converted  to  integers  by  the  formula 

INT(  X.  2‘"“  +  0.5)  .  X  >  0 


X 


quant 


INT(  X.  2'^’“  -  0.5)  .  X  <  0  . 


where  ibii  is  the  number  of  bits  assigned  for  the  integer  representation  of  the  fractional 
part  of  the  real  number  x.  In  the  simulations,  since  an  integer  is  stored  in  32  bits,  all 
registers  and  word  sizes  are  32  bits.  Multiplic.ation  is  performed  by  foiining  the  product 
in  a  48-bit  word,  scaling  down  by  2"'“‘',  and  then  rounding  off  to  the  nearest  integer. 
Inner  products  are  formed  similarly  by  accumulating  the  products  in  a  48-bit  word, 
scaling  down  and  then  rounding  off 


82 


In  order  to  minimize  the  effect  of  round-off  errors  and  finite  word-length  storage,  the 


recursions  of  the  DHOBE  algorithm  are  implemented  as  shown  below 


0(t) 

5(t) 

K(0 


=  e(t-l)  -t-  K(t)5(t) 

-  yd)  -  G''(t-l)0(t) 
X,  P(t-l)0(t) 

1  -X.-HX,G(t) 


G(t)  =  ct)T(t)  P(t-l)<D(t) 

P(t)  -_L[  I  -  KiOcDTa)  ]  Pd-l) 


o2(t)  =  (  1-}l[)  CT^Ct-l)-)- Xt  y2  _ 


(l-/.t)  5-(t) 

l-At-t-  Xi  OT(t)P(t-l  )OT(t) 


Notice  that  the  only  difference  between  these  equations  and  (2.3.5)-(2.3.7 )  is  that  the 
parameter  estimate  update  is  performed  using  P(t-l)  and  hence  errors  introduced  in  the 
formation  of  Pit)  do  not  affect  9(t). 


The  upper  bound  a  on  the  forgetting  factor,  has  to  be  chosen  with  care  in  the  fixed 
point  implementation  of  the  OBE  and  EOBE  algorithms.  If  a  is  chosen  greater  than  0.1, 
then  the  elements  of  the  matrix  P  often  increase  rapidly  in  magnitude  and  overflows  can 
occur.  The  reason  for  this  is  that  in  the  initial  stages,  the  optimum  value  of  the  forgetting 
factor  /.  equals  a  fairly  often.  Consequently,  since  1-  X  appears  in  the  denominator  of 
(2  3.7  ).  the  magnitude  of  the  elements  of  P  can  increase  and  cause  overflows.  On  the 
other  hand,  if  a  is  chosen  too  small  then  the  algorithm  takes  more  iterations  to  converge 
and  the  number  of  updates  increases.  A  value  of  a  =0.1  was  found  to  yield  a  satisfactory 
convergence  rate  and  inhibit  overflows  in  the  update  equation  for  P(t). 

In  addition  to  a  .  the  initial  value  G^(0)  has  to  be  chosen  small  enough  to  prevent 
overflows  in  the  subsequent  calculations  of  This  is  because  if,  at  any  time  t.  a-(t-l  i 
is  large  and  5^(t)  is  small  then  P  =  (y^-a^O-l))/  6Tt)  can  become  a  very  large  negative 
number  and  the  product  piG-l  )  can  overflow.  However,  if  overflows  can  be  detected 
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and  a  saturation  value  is  used  for  (3,  then  the  calculation  of  X  will  not  be  affected.  Since  (3 
is  negative  and  large  in  magnitude.  1+P(G-1 1  is  a  large  positive  or  negative  number 
depending  on  whether  G  is  greater  than  or  less  than  unity.  In  case  1+[3(G-1 )  is  positive, 
then  It  can  be  seen  from  (4)  that  v.  is  greater  than  unity,  and  consequently  X=  a.  On  the 
other  hand  if  1+P(G-1)  is  negative  then  a=  a  from  (4).  Thus  large  values  of  oTO)  can  be 
used  if  care  is  taken  to  account  for  overflows  in  the  algorithm  for  calculating  A. 
.'Mtematively,  the  formula  can  just  set  a  =  a  if  5"(t)  is  smaller  than  a  suitably  small 
number,  the  In  the  simulations,  the  initial  (unquantized)  value  (7-(0)  =100. 

For  the  RLS  algorithm,  the  initial  value  P(0)  is  also  imponant.  .Since  the  bias  in  the 
estimates  is  inversely  proponional  to  PfO).  PfO)  should  be  large.  However  if  P(0)  is  too 
large,  then  finite  word-length  effects  can  cause  the  Kalman  gain  vector  K  to  overflow, 
and  the  parameter  estimates  to  grow  exponentially  in  the  initial  stage  [.Ardalan.  1987]. 
Therefore  a  compromise  value  -  P(0)  =101  was  chosen. 

Simulation  Results 

The  performance  of  the  DHOBE  algorithm  is  compared  to  that  of  the  RLS  and 
the  exponentially  weighted  recursive  least- squares  (EWLS)  algonthms.  for  three  different 
processes,  in  all  the  cases,  the  noise  sequence  {v(t)j  and  the  input  sequence]  uitljare 
generated  by  a  pseudo-random  number  generator  with  a  uniform  probability  distnbution 
in  [-l.O.l.Oj.  The  upper  bound  y 2  is  set  equal  to  1.0.  The  parameter  estimates  are 
obtained  by  applying  the  DHOBE,  RLS  and  EWLS  (with  weighting  factor  a  =0.99)  to 
1000  point  data  sequences.  Ten  runs  of  the  algorithm  are  pert'ormed  on  the  same  model 
but  with  different  noi.se  sequences.  The  number  of  bits  used  for  the  fractional  pan.  ihii,  is 
vaned  from  16  down  to  6  bits  and  the  average  of  the  parameter  error  II0(  1  ()()()  i-O'  !'  -  is 
computed  for  each  value  of  ibii. 
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Example  4.1(Fig.4.n  AR(5)  process 

yit)  =  -0.326  y(t-l )  -0.427  y(t-2)  -0.717y(t-3)  -0.288  y(t-4)  -  0.399  y(i-5i  ^  vii) 

It  can  be  seen  from  Fig.  4.1  that  the  pert'ormance  of  the  DHOBE  algorithm  appears  to  be 
constant  as  the  number  of  bits  varies  from  16  to  8.  In  contrast,  the  performance  of  the 
RLS  and  EWLS  algorithn.s  degrades  for  ihii  <  8.  For  the  RES  algonthm  the  P  mam.x 
did  not  remain  positive  definite  in  many  runs  for  ihit  <  8.  Eor  the  EWLS  algorithm,  this 
happened  for  ihit  <12. 


Number  of  bits  (ibit) 

Figure  4. 1  .Average  parameter  estimation  error  for  the  D1  lOBE 
and  RLS  algorithms  for  an  AR(5)  process 

Example  4.2  (Fig.  4.2)  .ARX(2,3)  process 

y(t)  =  1 .6y(t-l  )-0.83ylt-2)-i-{).  14u(t)  -t-u(t-l )  -i-().16u(t-2)  -t-v(t) 

.As  before,  the  average  tap  error  of  the  DHOBE  algonthm  appears  constant  as  lint  \  anes 
from  16  to  8  bits.  The  P  rnatnx  became  negative  definite  tor  ihii  =  6. The  RLS  and  EWLS 
algonthms  do  not  work  well  for  ihit  <  10.  In  fact  P  became  indefinite  I'or  ihit  <  14.  in 


the  EWLS  case. 
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Figure  4.2  Average  parameter  estimation  error  for  the  DHOBE 
and  RLS  algorithms  for  an  ARX(2.3)  process 

Example  4.3  (Fig.  4.3)  ARX(10,10)  process 

The  DHOBE  algorithm  worked  well  for  ibit  >  12.  However  for  smaller  values.  P 
became  indefinite  and  overflows  occurred.  For  the  RLS  case,  P  became  indefinite  for  ihii 
<  14.  In  order  to  study  the  performance  of  the  DHOBE  algorithm  at  smaller  word- 
lengths.  a  LDU'  factorization  of  the  P  matrix  was  performed.  The  DHOBE  update 
equations  are  identical  to  the  update  equations  of  the  weighted  RLS  algorithm  with  weight 
at  =  /.(.  and  forgetting  factor  A(t)  =  (1-A,i)  and  hence  the  UDU'  form  of  the  DHOBE  can 
be  easily  developed  [Ljung,  1983,  pg.  334]  .  The  UDU'  form  of  the  DHOBE  algorithm 
is  then  compared  to  the  UDU'  form  of  the  RLS  algorithm.  The  simulation  results  show 
that  for  larger  word  sizes,  the  performance  of  the  RLS  algorithm  is  supenor.  Eor  smaller 
values  of  ibit,  the  average  parameter  estimation  error  is  about  the  same  for  both  the 
DHOBE  and  the  RLS  algorithms. 
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-a-  OBE 
RLS 

-a-  L'DU'fOBE) 
-o-  UDU'(RLS) 


Number  of  bits  (ibit) 


Figure  4.3  Average  parameter  esumanon  error  for  the  DHOBE 
and  RLS  algorithms  for  an  ARXdO.lO)  process 


Discussions 

Example  4.3  shows  that  the  performance  of  the  UDU'  versions  of  DHOBE  and  RLS 
algorithms  is  comparable  at  smaller  word-lengths.  The  superior  performance  of  the 
straightforward  implementation  of  the  DHOBE  algorithm,  as  compared  to  the  RLS  or 
EVVLS  algorithms  at  smaller  word-lengths  is  therefore  primanly  due  to  the  supenor 
numerical  properties  of  the  recursion  for  the  matnx  P(t). 

The  update  equation  for  the  RLS  algorithm  with  a  forgetting  factor  /.  is 

r  ,  P(t-l)cD(t)0^(t)  1  P(t-l  ) 

P(t)=[l- - —4 -  J-—  (4.6.1) 

A-t-O  (t)P(t-l)(I>(t) 

The  corresponding  equation  for  the  DHOBE  algorithm  can  be  rewntten  as 
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P(t)  =  [  I  - 


P(t-l)<D(t)(D^(t)  j 

1  -j- 

- +  O  (t)P(t-l  )0(t) 

/-i 


P(t-n 


(4.6.2) 


Since  1-  /vj  plays  the  same  role  in  the  DHOBE  algorithm  as  does  K  in  the  RLS  algorithm, 
the  only  difference  between  (4.6.1)  and  (4.6.2)  is  that  the  factor  (1-  A,  )/  appears  in  the 
denominator  of  the  term  within  braces  in  (4.6.2)  as  oppo.sed  to  the  corresponding  term  /. 
in  (4.6.1).  The  degradation  of  performance  occurs  primarily  because  the  term  within 
braces  becom.es  indefinite  on  account  of  round-off  errors.  Since  is  usually  much  smaller 
than  unity,  the  term  which  is  being  subtracted  from  the  identity  matrix  in  (  1.6.2)  is  much 
smaller  than  the  one  in  (4.6.1).  Thus  P(t)  in  the  RLS  algonthm  has  a  greater  tendency  ic 
become  indefinite  than  the  P(t)  in  the  DHOBE  algorithm.  This  observation  has  been 
continued  by  examining  the  eigenvalues  of  P(t),  for  runs  in  which  the  RLS  algorithm 
performed  poorly. 


CHAPTER  \ 


TRACKING  ANALYSIS 


5 . 1  Introduction 

Penormance  analysis  of  adaptive  filtering  algorithms  is  usually  pert'ormed  b\ 
assuming  that  the  unknown  system  which  is  being  modeled  is  time-invanant.  Howe\’er. 
in  actual  nracnce.  adaptu  e  filters  are  often  used  in  time  varying  environmeni.s.  and  hence 
it  is  imoonant  to  charactenze  the  performance  of  these  algonthms  when  the  system  mode! 
parameters  can  vaiy  with  time.  A  considerable  amount  of  attention  has  been  paid  to  this 
problem  in  the  adaptive  filtenng  literature,  with  analysis  being  performed  mainly  for  the 
LMS  arid  RLS  algorithms,  with  varying  amounts  u*' r  ..  See  for  exa.mpie  (VVidrou  . 
1976|.  [Benveniste.  19821,  fElefthenou.  1986],  (Rao,  1988],  [Gunnarsson,  1989j.  It 
wa,'  mentioned  in  Chapter  II  that  the  incorporation  of  a  forgetting  factor  in  the  DHOBE 
aigontnm  is  e.xpected  to  ennance  its  tracking  performance.  In  this  chapter,  the  tracKing 
chtmacterisiics  of  the  algorithm  are  studied  in  some  Gctad,  The  anah'sis  is,  in  some  '.'.a\  s. 
simplified  by  the  assumption  of  bounded  noise.  However,  as  in  the  previous  cnapter,  the 
presence  of  the  data  dependent  updanng  factor  complicates  any  derivation  of  e.xpressions 
for  the  tracking  error  (the  error  between  the  parameter  estimates  and  the  time  \  aiying 
parameters).  Time  vaiyang  parameter  estimation  using  the  LMS  algorithm  has  been 
anaivzed  using  a  deterministic  approach  in  [Anderson.  1983].  The  exponential  stabiiiiy  o: 
the  homogeneous  difference  equation  for  the  error  between  the  parameter  estimates  and 
the  true  parameter  is  used  to  show  bounded  parameter  estimation  error  when  tlic 
parameters  are  slowly  time  varx'ing.  This  approach  could  be  used  icir  me  DHOBI 
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aigonmm  also  but  the  analysis  is  complicated  by  the  possibility  of  a  zero  updating  gain 
factor,  tnough.  an  exponential  stability  tx^pe  propeny  could  still  be  formulated.  Ho\ve\’er. 
ror  the  DHOBE  algonthm  (and  other  membership-set  estimation  algorithms)  n  is  also 
euualiv  imponant  to  ensure  that  the  time  varying  true  parameters  (6*(t)  I  are  contained  in 
the  hounding  ellipsoids  {Ed.  So  instead  of  characterizing  tracking  in  terms  of  the 
parameter  estimation  error,  we  seek  conditions  which  w'ill  ensure  that  the  time  varying 
true  parameter  vector  is  contained  in  the  bounding  ellipsoids.  This  will  also  guarantee 
oounded  parameter  esumation  error.  Section  5.2  discusses  some  necessai  v  ana  suftwiciu 
conditions  for  the  eliip.soids  to  contain  6*(t).  It  is  also  shown  that  if  in  case  a  lump  in  tne 
true  raramieter  vector  0’  causes  it  to  fall  outside  the  nounding  elhp.soid.  liien  pros  uiea  tne 
jump  is  not  too  large,  the  oounding  ellipsoids  will  move  towards  B'  and  e\'entuaii> 
enclose  6*.  In  Section  5.3.  a  rescue  device  is  proposed,  whicii  \siii  guarantee  tne 
existence  of  ellipsoids  in  the  face  of  large  parameter  variations.  Simulation  results  are 
preset. teu  in  Section  5.4  which  show  that  the  DHOBE  algonthm  is  able  to  trick  siou  and 
abrupt  vanations  in  the  parameters.  The  tracking  pert'ormance.  in  terms  of  parameter 
estimation  error,  i.n  in  manv  cases  comparable  to  the  RLS  aigontnm  with  (..'rgeiting 
factor. 

5.2  Neces.sars  and  Sufficient  Conditions  for  1  racking 

As  mentioned  above,  tracking  in  the  context  of  bounding  eiiipsoidai  parameter 
estimation,  will  mean  ensunng  that  the  time  varying  true  parameier  s  ector  is  contained  in 
tne  oounding  ellipsoid.  'I'he  tneorems  below  present  conditions  tor  parameter  tracKing. 

Theorem  5.1  .\  sufficient  condition  torB*(t:  Ei  h 

(B  "  a  -  Bu  -  ;  ' '  P  u  -  1  ifB  *  1 1 1  -  B' t  -  1 1 !  a  G~i !  -  ;  i^,2.  i 

Proot ;  If  H*' :  ‘ -I  I,-. :  men  since  .  .r  Stand  Ti  r  E;  i  St.  it  foiio'A  s  mat  B '  :  l\- 

.XOvi  irom  2,  T  i  ..  B" ; '  -a  i,' .  ■  o  cuuivaient  to  ( 5.2. 1  .. 


90 


Theorem  5.2  0*(t)  e  E[  it  and  only  if 

10  *  1  n  -  0(  t  - 1  t  -  lifB  ^  (t )  -  B(t  -  D)  <  CT“(t  - 1) - — — ('/■  ~  v-(t 

where  \'(t)  is  the  noise  term  from  12.2.1). 


Proof:  Subaacting  0*(t)  from  noth  sides  of  (2.3.6)  yields 

0(ti-6*(t)  =  0(t  - 1 ;  -  0  *  (t )  T /.fP(t)0(  t  )5(  t  i  (5.2.3) 

Then  using  (2.3.4)  it  is  not  difficult  to  show  that 

V(  t )  =  ( 1  -  i[6(  t  -  i  I  -  0  ( t)]P‘‘(  t  -  r)(0(  t  - 1 )  -  0  "  ( t )] 

.  /..(!  )5^(t)  T  1  , 

(T  )-r /.,G(t) 

wnere  \ 'd  was  defined  in  (3.3.1 1 ;.  Using  (2.3.5)  now  yieid^ 

\'i  t  I  -  G‘(  t  I  =  (  1  -  /.(  )[0!  t  -  1  1-0  ^  (t)|P'‘(t  - 1)10(  t  -  1 1  -  0  *  (t  )1 

-r-/.t(\  “(t  )  -  •/-  )  -  I  1  -  X,  )C'(t  -  1)  (  .■'.2..'^  ; 

Since  0"(t I  €  £[.  iff  V(t)  <  (5.2.2)  then  follows. 

vvv 


Theorem  (5.2)  shows  that  by  choosing  y-  to  be  larger  than  the  actual  bound  say  y - 
on  \ i:  Is  possible  to  increase  the  tracking  capability  of  the  aigontnm.  The  next 
tneore::’.  ui\  es  an  upper  bounc  on  tne  maximum  vanatio'i  in  the  parameter-s  for  w  hich 
tracKipc  '.s  tiuarantecei 


Theorem  5.3  If0*(t-li€  Hi- 1.  and  /.i  -  0.  then  0*(t)  €  [>  if 


A(t):T 


,  '■  /.„..1P  (t  -  !  || 

-i .  a‘(t  - 1 1 - - - -\-r  -  x  ‘ml 


/.^jp  't-I'!:  > 


fPlt-ll!" 

ml*  * 


-  s  CT‘  ( t  -  1  ) 


)  P .  2 , 0 


X  nere 


.A(ti  =  0*((.  -  0"‘(t-l ) 


and  /,ni;r,  and  /.,„av  denote  minimum  and  maximum  eigenvalues  resnecti\ei\ 
i'rc'Oi;  I  sine  '.'.2.5’;.  it  is  straiuhtforward  to  snow  that 
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[e(t-n-e*(t)iP''(t-i)[e(t-i)-e*(t)] 

=  V(t-l)4-A'’'(t)P-‘(t-l)A(t)  +  2A'^(t)P'‘a-l)e(t-l)  (5.2.81 

where  6(t-l )  =  0(t-l )  -  G’d-l).  Now  substituting  (5.2.8)  in  (5.2.5)  yields 
V(t  )-  a-(t)  =  (1  -/.t)[V(t  -  r)-CT“(t  - 1)]+  /.,(y'“-Y') 

+(l-/.,)[A'^(t)P'St-nA(t)  +  2A’^(t)P'‘(t- 1)0(1 -D]  (5.2.91 

Since  0’*‘(t-l )  e  Ei.i,  therefore  V(t-l)  <  o2(t-l )  and  thus  a  sufficient  condition  for 
0*(t)  £  El  is 

A'^(t)P'^  t  -  l)A(t )  +  2A'^(t)P'S t  -  l)0(t  - 1)  <  — '—rr  -  'r  (  (5.2, 10) 

1-A, 


i.e.  0*(t)  e  El  if 

X_Jp-‘(t-l)]IIA(t)ll‘+2llA(t)ll  ll0(t-l)ll/.^,,lP''(t-l)l 


■max  I 


-  K 


(-/'  -  V  -  ) 


(5.2.1 1 


Since  )  <  a^(t-l ),  therefore 

Il0(t-l)[r< 


a“(t  - 1) 


>.^,n[F‘(t-l)] 


(5.2.12) 


Substitutine  (5.2.12)  in  (5.2.1 1  j  gives  a  sufficient  condition  for  0*(t)  £  Ei  as 


;._aJP'-(  t  - 1  )|ll  A(t)ll-  +2llA(t)ii^c-(t  -  1) 


/,^a.xfP  (t-I)] 
\^'-minfP’'(t  -  P'l 


■  (  -  V 


Soivine  tnis  Quadratic  ineauaiitv  then  yields  (5.2.6). 


(5.2.1.'i 


vvv 


The  noise  term  v-(t)  in  5.2.6  can  be  replaced  by  y-  to  yield  a  bound  which  can  be 
calculated  at  the  time  instant  i.  If  A(  =  0.  then  the  difference  between  y-  and  y-  cannot  be 
e.xploited  to  increase  the  tracking  capability  of  the  algorithm.  In  fact  in  this  case.  G’d)  e 
Et  iff  0*(t)  £  El-].  Thus  if  0’(t)  jumps  out  of  Ei-j.  and  no  updates  are  performed  at  future 
time  instants  i  .  then  0*(t-)-i)  £  Ei+j  =  Ei-j,  and  the  parameter  may  never  be  tracked. 
However,  it  can  be  argued  that  an  update  will  be  performed  in  a  finite  interval  of  time  . 
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This  IS  shown  heuristically,  by  examining  the  expression  for  the  magnitude  of  the 
predicdon  error 

i6(t)l  =  I  [6(t-l )  -0*(t)]^O(t)  +  v(t)  I 

If  no  updates  are  performed  for  a  large  interval  of  time  say  from  time  instant  t.  to  time 
instant  t  +  N]  then 

I  f0(t-T)-0’(t+i')]Tcp(t+i)  +  v(t+i)  I  <  (y2-o2(t-n]l/-  V  i  =  O.I....N1. 

If  the  input  and  noise  sequences  are  sufficiently  rich,  then  the  regressor  vector  Ou)  will 
span  the  parameter  space  in  all  directions  and  so  [0(t-l  i  -0'(t+i)l'f^O(i^ii  cannot  he 
arbitrarily  small  for  i  e  [0.  Nij.  If  lv(t+i)i  approaches  its  upper  bound  y.  for  some  1  in  the 
same  interval,  and  if  |v(t)|  is  sufficiently  uncorrelated  with  the  input  iuitil.  tnen  tne 
above  inequalim  will  be  violated  and  an  update  will  be  performed. 

If  the  parameter  variation  is  such  that  (5.2.2)  is  violated  then  0*(ti  e  E[.  The  next 
theorem  shows  that  if  0*{t)  remains  fixed  after  its  jump  out  of  and  if  the  jump  is  not 
large  enough  to  cause  the  subsequent  ellipsoids  Ei+i  .  for  i  >  0.  to  vanish,  then  the 
DHOBE  algorithm  guarantees  that  the  true  parameter  will  be  tracked  (enclosed)  in  finite 
time. 

Theorem  5.4  Assume  that  the  parameter  variation  at  time  instant  i.  causes  0*-iti  ^  E: 
Assume  funher  that 

(1)  After  this  vanation  .  the  parameter  remains  constant  (i.e.  the  jump  parameter  case ). 

(2)  0“(t+i)  >  0,  for  i  >  0. 

(3 )  TTie  algonthm  does  not  stop  updating. 

(4)  A  lower  bound  p  is  imposed  on  /tt  at  the  updating  instants. 
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Then  there  exists  a  N  i  >0.  which  depends  on  the  amount  of  parameter  variation  and  the 
actual  and  user  set  noise  bounds  such  that  6*(t)  e  Et+N,  • 

Proof;  Since  0*(t)  e  Ei  ,  define 

Ti  =  fecti-e^itiiP'Vtifeit^-e^itij-CT-ct)  >  o  (5.2.14) 

Assumption  (1)  will  imply' that  A(H  Ni)  =  Ait+I)  =0  for  arbitrary  positive  Ni. 

Substituting  in  ('5.2.9).  and  iterating  yields 

I  '-N.  t+N| 

V(  t  +  N 1 )  -  G'  ( t  +  Nj )  =  r|  (1  -  Aj )  -  X 9>a+N.  [ ^ n  -  Y'  ]  (5.2.15) 

i=t+l  i=t+l 

w'here  qj j  is  defined  n  Appendix  4A.  Assumption  (3)  will  ensure  that  some  of  the  At+i . 
1  >  0.  will  be  non-zero.  Since  the  second  term  on  the  right  hand  side  of  (5.2.15) 
negative,  and  since  the  first  term  is  non-increasing,  the  difference  V(t-(-N]  >  -  o-(t-i-N  j  i 

will  tend  to  zero  as  N'l  increases.  Thus  there  exists  a  .N'l  such  that 

V(t+Ni)  -  o2(t-(-Ni)  <  0  (5.2.16) 

An  estimate  of  the  time  Ni  required  for  the  ellipsoids  to  regain  the  parameter  vector  can 
be  obtained  by  the  following  analysis. 


■Assume  that  there  are  K  updates  in  the  interval  jt-i-l.t-i-N i  j.  From  (5.2. 15 1.  u  is  clear 
mat  me  inequality  (5.2.16)  will  be  satisfied  for  all  K  which  sansty 


Tif[(l-A,  )<£qj 


V-  -  V 


(5.2.17) 


K 

where  (t,)  is  the  sequence  of  updating  instants.  Now  let  R(K)  =  ^q[  ^  .  where  the 

.1=1 


sum  is  taken  over  the  updating  instants.  It  is  easy  to  see  that 

RfK)  =  (1- A[,  )R(K-1)  +  /.[  =RfK-l)+  Ai  ll-R(K-l)! 

K  K  K 

In  Appendi.x  4.A.  it  is  ^hown  that  R(t)  <  1  for  all  t  >  0.  and  by  Assumption  (4i.  >  p, 

hence 


RfK)>R(K-])+p|]-R(K-ll 
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and  so 


Thus 


RfK)>(l-p)R(K-I)  +  p 


RCK)  >  p  +  p(  1-p)  -f  ..+p( l-p)R-'  =  l-( l-p)>^ 

Thus  from  (5.2.17).  K  has  to  satisfy 

ri(l-p)^  <  [I-(l-p)^](Y“-y^) 

Hence  9*(t)  e  Ei+n.  if  the  number  of  updates  K.  in  the  interval  It.t+Ni)  satisfies 

logt^-^^^ — -+ 1) 

K  >  - 2—111 -  ! , 

-log(l  -p  ) 


5.3  .A  Re.scue  Procedure 


TVV 


In  many  cases  when  the  parameter  jump  is  large,  or  if  the  ellipsoid  has  shrunk  to  a 
very  small  size,  the  intersection  of  E[.i  and  Si  can  be  void.  In  that  case.  a^(t)  can  become 
neganve.  thus  indicating  that  a  bounding  ellipsoid  cannot  be  constructed.  To  circumvent 
this  failure  of  the  algorithm,  a  rescue  procedure  is  proposed.  If  at  any  time  instant  t.  a-(t) 
becomes  negative,  then  a2(t-l )  is  increased  by  an  appropriate  amount,  thereby  increasing 
the  size  of  E[.i  .  Then  the  intersection  of  the  larger  Ei.'  '’nd  S,  will  no  longer  be  void,  and 
tnus  an  eliip.soid  Ei  will  be  constructed.  Alternatively.  ,  .Id  be  increased,  to  permit  a 
non-null  intersection.  However,  the  former  procedure  is  preferable  because  it  causes  B(ti 
to  migrate  towards  0*(t).  thereby  reducing  the  parameter  estimation  error. 


There  are  essentially  two  different  cases  which  require  calculation  of  the  amount  of 
increase. 


Case  1 .  1-t- p(t)[G(t)-l  I  >  0  and  Vt  <  a. 

The  quantities  P(i),  G(t).  Vj.  and  a  have  been  defined  in  Section  2.3.  In  this  case, 

t ) 

- -  =  0  .  and  so  as  in  Lemma  3.1, 

d/-.  '  ■ 


95 


a2(t)  +  e2(t)  =  (5.3.1) 

Thus  a“(t)  is  negative  iff  le(t)l  >  y.  Using  (3.3.6)  this  implies  that  o^(t)  is  negative  iff 

(5.3.2) 


1-xt 

On  substituting  for  At  from  (2.3.13),  (5.3.2)  can  be  expressed  as 
.  G(t)-1 

I5(t)l>  — — — — - -y  ifG(t)9tl 


v'G(t)(l  +  p(t)[G(t)-l])-l 


(5.3.3) 


and 


I5(t)l> 


2v 


ifG(t)  = 


l  +  (3(t) 

Using  the  definition  of  P(t)  from  (2.3.17)  in  (5.3.3)  and  manipulating  terms  yields  a 

necessary  and  sufficient  condition  for  o-tt)  to  be  negative,  in  terms  of  a-(t-l  i 
1  ..  yfG(t)-  1)+I5(t) 


c"(t  - 1  )< 


G(t)-1 


5"(t)  +  ynG(t)-l]-- 


VG(t) 


=  K, 


and 


if  G(t) 
(5.3.4) 


a-(t-l)  <52(t)  +  y2_2y|6(t)l  =  Ki  ifG(t)=l 
Thus  the  rescue  procedure  would  replace  o2(t-l)  by  Ki+  where  I  is  a  positive 
constant,  to  ensure  that  a^d)  is  positive.  Using  C  =  1,  has  yielded  satisfactoiy  results  in 
simulations. 

Case  2.  =  a 


In  this  case,  from  (2.3.5  ).  a^d)  is  negative  iff 


S'it)  >  [l-a  +  aG(t)] 


a‘(t-l)  y 
+  • 


Thus  a-it)  is  negative  iff 

( t  - 1 )  <  a' 


5^(1) 


a  1  -a 


=  K- 


l-a  +  aG(t)  1-a 


(5.3.5; 


(5.3.6) 


In  this  case  a^d-l )  would  be  replaced  by  Kt  +  C 


5.4  Simulation  Results 
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The  tracking  properties  of  the  DHOBE  algorithm  are  studied  for  an  ARX(  1,1)  model 

y(t)  =  ay(t-l)+  bu(t)  +  v(t)  (5.4. 1 ) 

The  nominal  values  are  a  =  -0.5  and  /?  =  1.0.  The  noise  (v(t))  and  the  input  {u(t))  is 
generated  by  a  pseudo-random  number  generator  with  a  uniform  distribution  in  [-1,1  ]. 
For  the  DHOBE  algorithm,  a  =  0.2,  =  1.0,  and  c^COi  =  100.  The  parameters  are 

varied  as  follows 

Case  1.  Slow  vanarion  in  the  parameter  vector  from  r  =  1. 

The  parameters  a  and  h  are  varied  by  1%  forever\'  10  samples,  staning  from  the  rirsi 
sample,  and  the  output  data  (y(t))  is  generated  for  t  =1,2... 1000.  The  final  parameter 
estimation  error  is  T.OxlO'^,  the  final  volume  is  3.5x10'^  and  the  final  sum  of  semi-axes 
is  0.52.  All  the  bounding  ellipsoids  contained  the  true  parameter.  The  parameter  esumates 
are  plotted  against  the  true  parameters  in  Figure  5.1.  From  the  figure  it  is  clear  that  the 
DHOBE  algorithm  can  track  slow  time  variations  in  the  parameters. 

Case  2.  Slow  variation  in  the  parameter  vector  from  t  =  500. 

The  parameters  a  and  b  are  vaned  by  \%  for  every'  10  samples,  staning  from  the  500^ 
sample.  The  final  parameter  estimation  error  is  S.OxlO'-"  ,  the  the  final  volume  i> 
5.0x10'“  and  the  final  sum  of  semi-axes  is  0.54.  All  the  bounding  ellipsoids  contained 
the  true  parameter.  The  parameter  estimates  are  plotted  against  the  true  parameters  in 
Figure  5.2.  The  figure  shows  that  the  algorithm  can  track  slow  time  vanations  in  the 
parameters  even  after  it  has  "converged". 

Case  3.  Jump  in  the  MA  parameter  at  l  =  500. 

The  parameter  b  is  changed  by  100%  at  the  500^b  sample,  and  a  is  kept  constant  at  u.n 
nominal  value.  The  true  parameter  vector  is  out  of  the  bounding  ellipsoids  from  i  =  500. 
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to  i  =  530.  after  which  it  is  regained  by  the  bounding  ellipsoids.  The  final  parameter 
estimation  error  is  1.3x10'^  .  the  the  final  volume  is  4.0x10'^  and  the  final  sum  of  semi¬ 
axes  is  0. 14.  The  jump  thus  appears  to  have  resulted  in  bounding  ellipsoids  with  smaller 
sizes.  The  parameter  estimates  are  plotted  against  the  true  parameters  in  Fig.  5.3.  Fig.  5.4 
shows  the  parameter  estimates  obtained  for  this  case  by  applying  the  RLS  algorithm  with 
a  forgetting  factor  Ait)  =  0.9.  Fig.  5.5,  shows  the  estimates  when  the  variable  forgetting 
factor  proposed  by  Fonescue  and  Kershenbaum  [Goodwin.  1983)  is  incorporated  into 
the  RLS  algorithm  .  This  variable  forgetting  factor  A(t),  is  a  function  of  the  prediction 
error  and  is  given  by 


1  +G(t) 


.A  value  of  a'  =  0.01  was  used  because  it  yielded  steady  state  tracking  error  of  about  the 
same  magnitude  as  the  DHOBE  algorithm.  From  these  figures,  it  is  evident  that  the 
DHOBE  algorithm  can  track  jumps  in  the  parameters  at  least  as  well  as  the  exponential!) 
weighted  RLS  algorithm. 

The  effect  of  var\'ing  was  studied.  .A  value  of  y-  =  2  was  taJcen.  In  this  case,  the  true 
parameter  did  not  lump  out  of  the  bounding  ellipsoid  at ;  =  500.  The  parameter  e.stimate.s 
are  identical  to  those  in  Fig.  5.3.  But  the  ellipsoids  are  larger,  as  expected,  with  the  final 
volume  =  3.4  and  sum  of  semi  axes  =  4.08 

For  a  different  run.  i.e.  with  a  different  input  and  noise  sequence,  the  jump  at  i  - 
500.  caused  a-(t)  to  become  negative.  The  rescue  procedure  was  then  used  with 
remarkable  results.  The  true  parameter  was  captured  at  i=  501.  The  final  parameter 
estimation  error  =  2.4x10"^  .  the  final  volume  =  5.8x10'“.  and  the  final  sum  of  semi- 
axes=  0.65.  Fig.  5.6  shows  that  the  parameters  are  tracked  extremely  rapidly  in  this  case. 
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Figure  5.3  DHOBE  parameter  estimates  for  the  case  of  a  jump 
in  the  MA  parameter  at  t  =  500 
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Figure  5.5  RLS  ('Aath  variable  forgetnng  factor)  rdrameter  esumates 
for  the  jump  parameter  case 
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CHAPTER  VI 


CONCLUSION 

This  report  has  focussed  in  on  the  bounding  ellipsoid  approach  to  membership-set 
parameter  estimation.  It  has  been  shown  that  the  OBE  algorithms  often  yield  lower 
estimation  error  in  comparison  to  least-squares  algorithms,  when  the  unknown-but- 
bounded  noise  does  not  satisfy  the  usual  stationary  and  whiteness  assumptions.  The  OBE 
algorithms  are  thus  viable  alternatives  to  conventional  parameter  estimation  algonthms  in 
many  real  life  applications. 

Previous  work  in  the  area  of  membership-set  parameter  estimation  has  concentrated 
on  parameter  estimation  of  models  with  known  inputs  and  outputs.  In  this  repon,  one  of 
the  OBE  algorithms-  the  DHOBE  algorithm-  has  been  extended  to  perform  parameter 
estimation  of  linear  models  with  unknown-but-bounded  inputs.  The  extended  algorithm 
possesses  all  the  advantageous  features  of  the  OBE  algonthms  such  as  a  discerning 
update  strategN .  time  varying  parameter  tracking  capability  and  robustness  to  numerical 
effects.  The  transient  pert'ormance  of  the  algorithm  has  been  observed  to  be  superior  to 
that  of  the  ELS  algorithm  in  simulations.  This  is  particularly  advantageous  when  the 
number  of  data  points  is  small.  Analysis  of  the  extended  algorithm  has  shown  that  the 
aigonthm  yields  100%  confidence  intervals  for  the  parameters  at  every'  sampling  instant. 
The  anaiv.sI^  of  the  extended  algorithm  requires  less  restrictive  assumptions  than  the 
analyses  of  tiie  extended  least-squares  or  recursive  maximum  likelihood  algorithms. 

Analysis  of  the  finite  precision  effects  in  the  DHOBE  algorithm  has  shown  that  the 
algorithm  is  stable  with  respect  to  errors  due  to  finite  wordlengih  computations  and 
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storage.  A  detailed  analysis  of  this  nature  has  never  been  performed  for  any  of  the  other 
existing  MSPE  algorithms.  Furthermore,  simulation  results  show  that  the  matrix 
recursion  involved  in  the  DHOBE  algorithm  is  better  conditioned  numerically  than  the 
corresponding  recursion  in  the  conventional  recursive  least-squares  algorithm. 

The  tracking  characteristics  of  the  DHOBE  algorithm  have  been  studied  in  detail. 
Some  necessary  and  sufficient  conditions  for  parameter  tracking  have  been  derived.  It  has 
been  shown  theoretically  and  through  simulations  that  the  algorithm  can  track  small 
variations  in  the  parameters.  A  procedure  has  been  suggested,  whereby  large  jumps  in  the 
parameters  can  also  be  tracked. 

The  work  performed  in  this  report  can  provide  a  spring  board  for  several  areas  of 
future  research.  The  connection  between  the  OBE  algorithms  and  the  weighted  least- 
squares  algorithm  could  perhaps  be  exploited  to  develop  fast  (  0(N)  )  implementations. 
It  may  turn  out  that  the  numerical  propenies  of  these  implementations  are  superior  to 
those  of  the  existing  fast  least-squares  algorithms.  Another  promising  research  direction 
is  to  recast  the  OBE  algorithms  in  an  output  error  formulation.  The  algorithms  could  then 
be  used  to  obtain  unbiased  estimates  for  ARX  models  with  output  noise,  which  are 
commonly  used  models  in  adaptive  control.  The  use  of  the  OBE  algorithms  in  reduced 
order  modeling  is  also  an  interesting  application.  If  a  bound  on  the  maximum  allow'able 
modeling  inaccuracy  is  specified,  the  OBE  algorithms  could  be  used  to  generate  a  class  of 
reduced  order  models  which  can  approximate  the  unknown  large  order  system.  Thus 
there  exists  a  gamut  of  applications  where  the  OBE  algorithms  can  be  applied  and  in  fact 
prove  to  be  viable  alternatives  to  conventional  parameter  estimation  algorithms. 


APPENDIX  2A 


Derivation  of  the  Bias  Expression  (2.5.4) 


The  system  model  is  an  ARXd.l  )  model 

y(t)  =  .r  y(t-l )  +  do  u(t)  +  v(t)  C. 

where  the  measurable  input  u(t  )  is  white  and  uncorrelated  with  the  noise  v(t).  and 

v(t)  =  A  sin  (COot*  +  ( 1-A  )w(t)  (2 

with  w(t)  being  a  white  noise  sequence. 

Define 

n(t)  =  A  sin  (coot) 

'fhe  predictor  model  is 


(2 


v(t)  =  a  v(t-l )  +  b  u(t) 


(2 


The  RLS  aleonthm.  at  time  instant  t  =  N  minimizes 


N 


-X[y(t)-\-(t)]' 
•  1=1 


The  RLS  solution  thus  satisfies 
1 

—  ^[(.x  -  a)y(  t  - 1)  +  (do  -  b)u(t )  +  v(i)]| 


N 


1=1 


v(t-l)' 


!_  u(t)  j 


1  =  0 


In  order  to  obtain  an  expression  for  the  asymptotic  bias  the  following  definitions  are 
made. 


Let  pit)  and  q(t)  be  two  signals.  Define  the  sample  expectation 


E[p(t)|=  lim 

i\ 


(2 


and  the  sample  cross  correlation 
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1 


N 


Rpq(i)  =  E[p(t-i)q(t)]  =  Jim  — ]£p(t-i)q(i) 


(2A.8) 


provided  the  limit  exists.  Then  from  ('2A.6)  the  RLS  solution  asymptotically  satisfies 


a  =  x  + 


Rw(l) 


Rw(0) 


f2A.9) 


and 

b=bo  (2A.10) 

Thus  the  estimate  of  the  moving  average  coefficient  is  unbiased. 


To  oDtain  an  expression  for  the  bias  in  the  AR  coefficient,  the  sample  correlations  in 
(2A.9)  are  evaluated.  Multiplying  the  L.H.  S.  and  R.H.S  of  f2A.l)  by  yi,t)  .  talcing 
sample  expectations  and  exploiting  the  fact  that  u(t)  is  white  and  uncorrelated  with  v(i  i 
yields 

Ryy  (0)  =  Jr  Ryy  (1)  +  (Q) 

But  since  w(t)  is  white 

Rvv  (0)  =  Rvn  (0)  +  (1— A)2 


Hence 


Ryy  (0i  =  .r  Rvy  (\)  +  hool+  RyTi  (0)  +  (  1-A)2  o; 
It  is  easy  to  show  [Ljung,  1987,  pg.  28],  that 

E[n2(t)j=~, 


(2A.11 


(2A.12) 


and 


A2 

Eindinit-T)]  =  ^coscoqx 


(2A.13) 


.\ow  multiplying  both  sides  of  GA.l)  by  n(t)  and  taking  sample  expectations  yields 

Ryn(0l=.xRyn(n  +  Rnn(O)  i2A.14) 

or  altemativelv 


N  i 


El  v(  t)n(t))  =  lim  v(0'i —  Y  x'  +  lim  — Y  V  A:‘n(t)n(  t  -  i 


1=1 


1) 


1=1 i=i 


(2A.15) 
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Since  Lr  I  <  1  to  ensure  a  stable  system,  the  first  term  on  the  right  hand  side  of  ('2A.15j 

I 

vanishes.  Using  the  fact  that  the  series  ^-t‘n(t)n(t  -  i)  converges,  it  is  not  difficult  to 


show  that 


N  1  N 

E[y(t)n(t)]=  lim  T -v'  — V  n(t)n(t-i) 


(2A.16J 


which  using  (2A.13;  vields 

A“  ^ 

E[y(t)n(t)l  = -  lim  ^Jc'coscooi 

2  N— , 

1=1 

By  expressing  x  as  e“,  the  infinite  sum  in  (2A.17)  can  be  evaluated  to  yield 

A“  l-.rcos(0(, 


E[y(t)n(t')l  = 


2  1  -  2.r  cos  ooa  +  x  ■ 


f2A.17) 


(2A.18) 


And  so  using  (2A.12)  and  (2A.14) 

E[y(t  -  l)n(ti!  =  ’  = 


cos  COo  -  -V 


2  1  -  2j:  cos  (Do -r  .X  ■ 


(2A.19) 


Since  w(t)  IS  white,  therefore.  Ryv(l)=  Rynd)- 

.Now  multiplying  (2A.1)  by  ylt-l) .  and  taking  sample  expectations  as  in  (2A.lli  yields 

Ryy  fl)  =.r  RyyfO) +  Ryn(l)  f2A.20) 

Using  (2.-\.l  1 )  and  (2A.20)  to  solve  for  Ryy  (0)  then  yields 

TV  1  1"  A‘[l  -  ,xcos  Wo]  ,  ^  .  A“1 

Kvv  (0i  = - r- - ^ +  ,|  - :  (2.-\.2l  : 

i-.v*  I  1  -  2,x  cos  Wo  + -V"  ■  2  ; 

Substituting  the  expressions  for  Rw  (0)  and  Rw  ( 1 )  =  Rvn  ( 1 )  in  (2A.9  )  finallv  vields 


a  =  .X 


^[coswo-.x] 


A^ 


+  (^00^  +  n-Ara; 


1  -  2,xcosWo  +  .x“  ' 

- T? - 1 


APPENDIX  3A 


Proof  of  the  updating  gain  formula  (3.2.8)  and  (3.2.9); 


Since  the  optimum  forgetting  factor  minimizes  a2(t),  therefore 

o2(t,  <  o2(t,  0)  =  a2(t-l)  (3A.1) 

and 

^  ^  (l-^,)~  -  a”  G(t) 

-  o"(t-r)  -  S'Ct)  - ^  (3 A. 2 ) 

( i-x,+  AjG(t)  y 

and 


da^d)  2 

-  =  V 

dk 

I 


dV(t)  2  57t)G(t) 

- -  =  - ^  (3Ao) 

dx;  (i-\4-XjG(t)r 

Thus  d^  o^(  t)  /  d  >  0.  unless  dH  t)  =  0  or  G(t)  =  0.  Since  P(t-l)  is  positive  definite. 
G(t)  =  0  iff  0(t)  =  0.  The  algorithm  can  be  modified  to  detect  the  occurrence  of  a  null 
0(t)  and  set  it  to  a  small  non-zero  value,  prior  to  the  calculation  of  G(t).  Thus  it  can  be 


assumed  that  Git)  st  ()  for  all  t.  If  5‘(ti  =  0.  then,  since  a~(0)  <  b>'  ('3.2.7).  atid  sinae 

o^(t)  is  non-increasing,  therefore  a^it- 1  )-t-52(t}  <  y^  and  by  (3A.2).  do^ttl/dA,  is 
positive,  and  hence  a2(t')  is  minimized  if  =  0,  Now  for  the  sequel,  the  second 
denvative  of  oM  t)  can  be  assumed  to  be  positive  and  hence  the  unique  minimum  occurs 
at  d  a‘(  t)  /d/v^  =  0.  From  (3A.2),  if  G(t)  =  1.  a^ft)  is  minimized  if 

/.,*  =  (]- P(t)  )/2  (3A.4) 

Otherwise  if  G(t)  ?tl.  a^(t  )  is  minimized  if 


}.*  = 


G(t) 


G(t) 


1+  (3(t)(G(t)-  1) 


(3A.5) 


Moreover,  in  (3A.4)  and  (3A.5) 
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>  0  <=>  p(t)  <  1  <=>  o2(  t-1)  +  52(  t)  >  (3A.6) 

It  is  easy  to  show  that  1+P(t)  (G(t)-l)  is  always  positive.  Since  0^(0)  <  and  a2(t)  is 
non-increasing,  therefore  p(t)  >  0.  From  (3A.6),  p(t)  <  1,  hence  1-  l/p(t)  <  0.  Then 
l+p(t)  (G(t)-l)  <  0  =>  G(t)<  1- 1/P(t)  =>  G(t)<0 
which  is  a  contradiction.Thus  (3A.5)  would  always  yield  real  X*  .  It  is  now  shown  that 
(3A.4)  and  (3A.5)  yield  values  of  which  are  upper  bounded  by  unity. 

If  G(t)=l,  then  since  P(t)  >  0,  (3A.4)  yields  Xj*  <  1. 

IfG(t)<l,  then?.,*>  1  «  l-[G(t)/(l-t-  p(t)(G(t)-l) )  >  l- G(t) 

ci>G(t)  (1+ p(t)(G(t)-l)  )>  I  (3a.7) 

But  G(t)  <  1  and  P(t)  >  0  contradict  (3A.7).  Hence  if  G(t)  <  1.  then  <  1 .  It  can  be 
shown  in  exactly  the  same  way  that  G(t)  >  1,  would  imply  that  >4*  <  1.  Thus  unlike  the 
case  in  fDasgupta,  1987],  no  upper  bound  has  to  be  imposed  on  the  forgetting  factor 


APPENDIX  3B 


A  Time  Domain  Implication  of  the  SPR  Condition 


A  sufficient  condition  for  convergence  of  the  ELS  algorithm  is  that  the  transfer 
funcnon  H(z'* )  =  [  -  y  ]  be  strictly  positive  real .  This  means 


Re  H(eJ“)  >  0  V  to  e  I-tt.kJ 


f3B.l] 


A  necessar>'  condition  for  (  3B.1)  to  hold  will  now  be  derived. 

Using  the  definition  of  H(z'l),  (  3B.1)  becomes 

Re{ — - - 4]  >0  V  cos  [-71,71]  (3B.2) 

C(e-J“)  2  J 

Now 

Re{  — 5 - 4]  =  yRe  (  1-Clg'-”  } 

C(e-J“)  -  -  l+cie-J“  +C2e-2J“  -  ..  +Cre-''J“ 

1  f  1  -  c,  cosco-Ct  cos2co-..-c, cos  r(0  +  j(Ci  sino)  +  Ct  sin2to+..+c,  sin  rco  i 

=  —  Re- - i - = - : - ^ i ^ - > 

2  ^  1  +  C[  COSCO  +  Co  cos2co+..+CjCOS  rco  -  j(C]  sinco  +  C2  sin2co+..+c,  sin  rco  I 


2  1  JcCco) 


Rationalizing  the  above  expression  and  taking  the  real  pan  yields 

l~l  r 

Re- - ^ — I  1  “(c?  +c;+..+  Cf  ]-22^  ^c,c ,  cos(,/  -  t  )co  I  (3B.3  ) 

/=1 7^2  '  J 

”1 

r-1 

^^co)^  c?)-  S^yCosyco  j  (3B.4) 


where  a:(  co)  is  positive  function  of  the  c's  and  co,  and 

i-j 


a:,  =2^0,. 


i+; 


1=1 


(3B.5i 


The  SPR  condition  (  3B.2)  then  implies 


i~i 


>fcf  +C3+..+  Cr  )  +  ^X^COS,/CO  VCOSj-TC.Tt] 
/=i 


111 


112 


Now  define 


(CO)  =  ^ cosy'ci)  Vfoe[-7r,7c] 


K  71 

I  Ar.i(u)  do)  =  y  A',  I  cosya)da)  =  0 

J—n  •'  J— 71 

J=l 

Hence  r\r-i(col  cannot  hcvc  the  same  sign  for  all  o)  e  So  for  some  o).  Ar-ilco) 


will  be  non-negative,  and  thus 


Cj"  <  1 


Thus  tne  SPR  condition  implies  that  the  h  norm  of  the  impulse  response  of  the  filter 
Cfz'l  )  =  1  +  ciz‘^  -t-  C2  z'2  +  ..  CrZ'^'  ,  is  upper  bounded  by  2,  or  in  other  words,  the 
coefficient  vector  [ci,  C2.  ••  Cr]^  lies  in  the  unit  sphere  centered  at  the  origin  of  r- 
dimensional  Euclidean  space. 

Next,  it  is  shown  that  the  condition  (3.3.10b)  implies  that  the  SPR  condition  (  3B.2) 


will  hold.  It  can  be  seen  from  (  3B.3)  that 

Rel— i— -iU-L  i-(c2+c?+..+  c?)-2y  Tic.l  Ic, 


._1U^ 


ae'J“^)  2J  .rlco) 


(=1;=2 


Thus  i  3B.2)  will  hold  if 


l-(cf +c5+..+  c;)-2X2^Ic,I  Ic^l  >0 

.=1  j=2 


i.e.  if 


1  -  (lCil-(-IC2l+..+  ICflj  >  0 

r 

Thus  ^Ic.l  <  1  implies  that  the  SPR  condition  (  3B. 2)  is  true. 


APPENDIX  4A 


Summability  of  the  Weighting  Factors  of  the  DHOBE  Algorithm 


Define 


q,t 


t 

X\  fjd- A.'j)  if  i  <  t 


if  i  =  t 


(4A.1 ) 


Then  the  term  in  square  brackets  in  (4.4.32)  can  be  defined  as 

t 

R(t)=  X^jt 

J=lo  +  l 

And 

R(t)  =  (1- /;,)Rd-l)  +  (4A.3) 


Nov*. 


R(tt)+1 )  =  <1 


Assume 

R(t-l)  <  1 
Then  by  (4A.3) 


R(t)  <  (1-  X',).l  +  k\ 


i.e. 


R(t)  <  1 
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